Form PTO 1390 U.S. DEPARTMENT OF COMMERCE PATENT AND TRADEMARK OFFICE
(REV 5-93) 

TRANSMITTAL LETTER TO THE UNITED STATES 
DESIGNATED / ELECTED OFFICE (DO/EO/US) 
CONCERNING A FILING UNDER 35 U.S.C. 371 


ATTORNEY'S DOCKET NUMBER 

B45168 


U.S. APPLICATION NO. (If known, see 37 C.F.R. 1.5) 

09/868604 


INTERNATIONAL APPLICATION NO 

PCT/EP99/10297 


INTERNATIONAL FILING DATE 

21 December 1999 


PRIORITY DATE CLAIMED 

21 December 1998 


TITLE OF INVENTION 

VACCINE 


APPLICANT(S) FOR DO/EO/US

Alex BOLLEN, Alain FAUCONNIER, and Edmond GODFROID



Applicant herewith submits to the United States Designated/Elected Office (DO/EO/US) the following items 
and other information: 



1. [x] This is a FIRST submission of items concerning a filing under 35 U.S.C. 371.

2. [ ] This is a SECOND or SUBSEQUENT submission of items concerning a filing under 35 U.S.C. 371. 

3. [x] This is an express request to begin national examination procedures (35 U.S.C. 371(f)) at any time rather
than delay examination until the expiration of the applicable time limit set in 35 U.S.C. 371(b) and PCT
Articles 22 and 39(1).

4. [x] A proper Demand for International Preliminary Examination was made by the 19th month from the 

earliest claimed priority date. 

5. [x] A copy of the International Application as filed (35 U.S.C. 371(c)(2)) 

a. [ ] is transmitted herewith (required only if not transmitted by the International Bureau). 

b. [x] has been transmitted by the International Bureau. 

c. [ ] is not required, as the application was filed in the United States Receiving Office (RO/US). 

6. [ ] A translation of the International Application into English (35 U.S.C. 371(c)(2)). 

7. [x] Amendments to the claims of the International Application under PCT Article 19 (35 U.S.C. 371(c)(3)) 

a. [ ] are transmitted herewith (required only if not transmitted by the International Bureau). 

b. [x] have been transmitted by the International Bureau. 

c. [ ] have not been made; however, the time limit for making such amendments has NOT expired. 

d. [ ] have not been made and will not be made. 

8. [ ] A translation of the amendments to the claims under PCT Article 19 (35 U.S.C. 371(c)(3)).

9. [ ] An oath or declaration of the inventor(s) (35 U.S.C. 371(c)(4)). 

10. [ ] A translation of the annexes to the International Preliminary Examination Report under PCT Article 36
(35 U.S.C. 371(c)(5)).

Items 11. to 16. below concern other document(s) or information included: 

11. [x] An Information Disclosure Statement under 37 C.F.R. 1.97 and 1.98; and Form PTO-1449.

12. [ ] An assignment document for recording. A separate cover sheet in compliance with 37 C.F.R. 

3.28 and 3.31 is included. 

13. [x] A FIRST preliminary amendment.

14. [ ] A SECOND or SUBSEQUENT preliminary amendment. 

15. [x] Please amend the specification by inserting before the first line the sentence: This is a 371 of
International Application PCT/EP99/10297, filed December 21, 1999, which claims benefit
from the following Provisional Application: GB 9828217.1 filed 21 December 1998.

16. [ ] A substitute specification. 

17. [ ] A change of power of attorney and/or address letter. 

18. [x] An Abstract on a separate sheet of paper. 

19. [x] Other items or information: Sequence Listing, Statement to Support, Diskette 


20. [X] The following fees are submitted: 


CALCULATIONS PTO USE ONLY 


Basic National Fee (37 C.F.R. 1.492(a)(1)-(5)):


$710.00 


Search Report has been prepared by the EPO or JPO $860.00 


International Preliminary Examination Fee paid to USPTO (37 CFR 1.482) 
$690.00 


No International Preliminary Examination Fee paid to USPTO (37 CFR 1.482)
but international search fee paid to USPTO (37 CFR 1.445(a)(2)) $710.00


Neither International Preliminary Examination Fee (37 CFR 1.482) nor
international search fee (37 CFR 1.445(a)(2)) paid to USPTO $1,000.00


International Preliminary Examination Fee paid to USPTO (37 CFR 1.482) and 
all claims satisfied provisions of PCT Article 33(2)-(4) $100.00 


ENTER APPROPRIATE BASIC FEE AMOUNT = 


$710.00 




Surcharge of $130.00 for furnishing the oath or declaration later than □ 20 □ 30
months from the earliest claimed priority date (37 CFR 1.492(e)).


$0.00 




Claims                                      Number Filed   Number Extra   Rate

Total claims                                99 - 20 =      79             79 x $18.00   $1422.00

Independent claims                          10 - 3 =       7              7 x $80.00    $560.00

Multiple dependent claims (if applicable)                                 + $270.00     $270.00




TOTAL OF ABOVE CALCULATIONS = 


$2252.00 




Reduction by 1/2 for filing by small entity, if applicable. Verified Small Entity 
statement must also be filed. (Note 37 CFR 1.9, 1.27, 1.28). 


$ 




SUBTOTAL = 


$2962.00 




Processing fee of $130.00 for furnishing the English translation later than 

□ 20 □ 30 months from the earliest claimed priority date (37 CFR 1.492(f)) + 


$ 




TOTAL NATIONAL FEE = 


$2962.00 
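
The fee arithmetic on this sheet can be re-checked with a short sketch. The amounts, rates, and claim counts below are the ones entered on the sheet; the function name and parameter names are illustrative assumptions and do not come from any USPTO form or tool.

```python
from decimal import Decimal

# Rates as printed on this sheet (hypothetical constant names).
RATE_EXTRA_CLAIM = Decimal("18.00")        # per claim in excess of 20
RATE_EXTRA_INDEPENDENT = Decimal("80.00")  # per independent claim in excess of 3
RATE_MULTIPLE_DEPENDENT = Decimal("270.00")


def national_fee(basic_fee, total_claims, independent_claims,
                 multiple_dependent=False, surcharge=Decimal("0.00")):
    """Recompute the national fee from the figures entered on the sheet."""
    extra_total = max(total_claims - 20, 0)
    extra_independent = max(independent_claims - 3, 0)
    claims_fee = (extra_total * RATE_EXTRA_CLAIM
                  + extra_independent * RATE_EXTRA_INDEPENDENT)
    if multiple_dependent:
        claims_fee += RATE_MULTIPLE_DEPENDENT
    return basic_fee + surcharge + claims_fee


total = national_fee(Decimal("710.00"), total_claims=99,
                     independent_claims=10, multiple_dependent=True)
print(total)  # 2962.00
```

The claim-related charges alone ($1422.00 + $560.00 + $270.00) come to $2252.00; adding the $710.00 basic fee gives the $2962.00 shown as the subtotal and total national fee.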






Amount to be 
refunded 


$ 


charged 


$ 



a. □ A check in the amount of $ to cover the above fees is enclosed. 

b. [x] Please charge my Deposit Account No. 19-2570 in the amount of $2962.00 to cover the above fees.
A duplicate copy of this sheet is enclosed.

c. [x] The Commissioner is hereby authorized to charge any additional fees which may be required, or
credit any overpayment to Deposit Account No. 19-2570. A duplicate copy of this sheet is enclosed.

d. [x] General Authorization to charge any and all fees under 37 CFR 1.16 or 1.17, including petitions for
extension of time relating to this application (37 CFR 1.136(a)(3)).



NOTE: Where an appropriate time limit under 37 CFR 1.494 or 1.495 has not been met, a petition to 
revive (37 CFR 1.137(a) or (b)) must be filed and granted to restore the application to pending status.

SEND ALL CORRESPONDENCE TO: 
GLAXOSMITHKLINE
Corporate Intellectual Property - UW2220
P.O. Box 1539
King of Prussia, PA 19406-0939
Phone (610) 270-5024 
Facsimile (610) 270-5090 



SIGNATURE 
Zoltan Kerekes 

NAME 

REGISTRATION NO. 

DATE OF DEPOSIT: 20 June 2001

Attorney Docket No. B45168 

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

Applicant: Bollen, et al. 20 June 2001 

Intl. App. No.: PCT/EP99/10297 Group Art Unit: Not Yet Assigned

Intl. Filing Date: 21 December 1999 Examiner: Not Yet Assigned 

For: VACCINE 

Assistant Commissioner of Patents 
Box: PCT 

Washington, D.C. 20231

PRELIMINARY AMENDMENT 

Preliminary to the examination of this application, Applicants respectfully request 
consideration and entry of the following Preliminary Amendment. 

Applicants are submitting herewith a new Statement to Support Filing and Submission in
Accordance with 37 CFR §§ 1.821 Through 1.825, which includes three corrected sheets (pages
111, 113, and 114) pursuant to 37 CFR § 1.825. In addition, Applicants are submitting the
complete Sequence Listing on computer diskette.

IN THE SPECIFICATION 

Please amend Table 3 on page 38 as follows: 



Table 3

Names    Coding sequence from/to        Coding DNA    SEQ ID    Homologous genes (from Yersinia,
         (with reference to Fig. 5)     strand        NO:       unless otherwise specified)

Class II ORFs which putatively code for effector proteins

bopN     11906/13003                    complement    41        YopN (= lcrE)
orf1     6160/6747                      direct        43        None
orf2     10752/11120                    complement    45        None
orf3     11117/11527                    complement    47        None
orf4     11532/11909                    complement    49        None
orf5     13002/13784                    direct        51        None
orf6     13806/14081                    direct        53        None
orf7     14630/15571                    direct        55        None
orf8     15601/16803                    direct        57        None
orf9     16827/17288                    direct        59        BcrH
orf10    17293/17814                    direct        61        pcr4 (Pseudomonas aeruginosa)
orf11    29412/29591                    complement    63        None
orf12    29555/30529                    complement    65        None
orf13    30631/31776                    direct        67        None
orf14    31818/33005                    complement    69        None
orf15    32370/33014                    direct        71        None


IN THE CLAIMS:

Please cancel claims 1-29. 
Please add new claims 30-78. 

30. An isolated polypeptide comprising an amino acid sequence which has at least 75% 
identity to the amino acid sequence selected from the group consisting of: SEQ ID NO:42, 44, 
46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70 and 72 over its entire length. 

31. The polypeptide as claimed in claim 30 comprising the amino acid sequence selected
from the group consisting of: SEQ ID NO:42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 
70 and 72. 

32. An isolated polypeptide of SEQ ID NO:42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 
66, 68, 70 or 72. 

33. An isolated polypeptide comprising a fragment of at least 7 consecutive amino acids of
the polypeptide as claimed in any one of claims 30 to 32, wherein the fragment comprises an 
epitope. 

34. The polypeptide of claim 33, wherein the fragment is immunogenic. 

35. An isolated polynucleotide comprising a nucleotide sequence encoding a polypeptide that 
has at least 75% identity to the amino acid sequence of SEQ ID NO:42, 44, 46, 48, 50, 52, 54, 56, 
58, 60, 62, 64, 66, 68, 70 or 72 over its entire length; or a nucleotide sequence complementary to 
said isolated polynucleotide. 

36. An isolated polynucleotide comprising a nucleotide sequence that has at least 75% identity 
to a nucleotide sequence, encoding a polypeptide of SEQ ID NO:42, 44, 46, 48, 50, 52, 54, 56, 
58, 60, 62, 64, 66, 68, 70 or 72, over its entire length; or a nucleotide sequence complementary 
to said isolated polynucleotide. 

37. An isolated polynucleotide which comprises a nucleotide sequence which has at least 
75% identity to that of SEQ ID NO:41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69 or 
71 over its entire length; or a nucleotide sequence complementary to said isolated polynucleotide. 

38. The isolated polynucleotide as claimed in claim 35 in which the identity is at least 95% 
to SEQ ID NO:41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69 or 71 over its entire 
length. 

39. The isolated polynucleotide as claimed in claim 36 in which the identity is at least 95%
to SEQ ID NO:41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69 or 71 over its entire
length.

40. The isolated polynucleotide as claimed in claim 37 in which the identity is at least 95% 
to SEQ ID NO:41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69 or 71 over its entire 
length. 

41. An isolated polynucleotide comprising a nucleotide sequence encoding the polypeptide of
SEQ ID NO:42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70 or 72. 

42. An isolated polynucleotide comprising the polynucleotide of SEQ ID NO:41, 43, 45, 
47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69 or 71. 

43. An isolated polynucleotide comprising a nucleotide sequence encoding the polypeptide of 
SEQ ID NO:42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70 or 72, obtainable by 
screening an appropriate library under stringent hybridization conditions with a labeled probe 
having the sequence of SEQ ID NO:41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69 or 
71 or a fragment thereof. 

44. An expression vector comprising an isolated polynucleotide according to any one of 
claims 35-43. 

45. A recombinant live microorganism comprising an isolated polynucleotide according to 
any one of claims 35-43. 

46. A host cell comprising the expression vector of claim 44 or a subcellular fraction or a 
membrane of said host cell. 

47. A process for producing the polypeptide of claim 30 comprising the steps of culturing a 
host cell of claim 46 under conditions sufficient for the production of said polypeptide and 
recovering the polypeptide from the culture medium. 

48. A process for expressing a polynucleotide of any one of claims 35-43 
comprising transforming a host cell with an expression vector comprising at least 
one of said polynucleotides and culturing said host cell under conditions sufficient 
for expression of any one of said polynucleotides. 

49. A vaccine composition comprising an effective amount of the polypeptide 
of claim 30 and a pharmaceutically acceptable carrier. 

50. A vaccine composition comprising an effective amount of the polypeptide
of claim 31 and a pharmaceutically acceptable carrier. 

51. A vaccine composition comprising an effective amount of the polypeptide 
of claim 32 and a pharmaceutically acceptable carrier. 

52. A vaccine composition comprising an effective amount of the polypeptide 
of claim 33 and a pharmaceutically acceptable carrier. 

53. A vaccine composition comprising an effective amount of the polypeptide 
of claim 34 and a pharmaceutically acceptable carrier. 

54. The vaccine composition of claim 49, wherein the polypeptide has an 
amino acid sequence selected from the group consisting of: SEQ ID NO:42, 46, 
48, 50, 52, 54, 56, 58, 60 and 62. 

55. A vaccine composition comprising an effective amount of the polynucleotide of any one 
of claims 35 to 43 and a pharmaceutically acceptable carrier. 

56. The vaccine composition according to any one of claims 49-55, wherein 
said composition comprises at least one other Bordetella pertussis antigen. 

57. An antibody immunospecific for the amino acid sequence of claim 30 or 31.

58. An antibody immunospecific for the polypeptide of claim 32. 

59. An antibody immunospecific for the fragment of claim 33. 

60. An antibody immunospecific for the fragment of claim 34. 

61. A method of diagnosing a Bordetella pertussis infection, comprising identifying a
polypeptide as claimed in claim 30, or an antibody that is immunospecific for said polypeptide,
present within a biological sample from an animal suspected of having such an infection. 

62. A method of diagnosing a Bordetella pertussis infection, comprising identifying a
polypeptide as claimed in claim 31, or an antibody that is immunospecific for said polypeptide,
present within a biological sample from an animal suspected of having such an infection. 

63. A method of diagnosing a Bordetella pertussis infection, comprising identifying a 
polypeptide as claimed in claim 32, or an antibody that is immunospecific for said polypeptide,
present within a biological sample from an animal suspected of having such an infection. 

64. A method of diagnosing a Bordetella pertussis infection, comprising identifying a 
polypeptide as claimed in claim 33, or an antibody that is immunospecific for said polypeptide,
present within a biological sample from an animal suspected of having such an infection. 

65. A method of diagnosing a Bordetella pertussis infection, comprising identifying a 
polypeptide as claimed in claim 34, or an antibody that is immunospecific for said polypeptide,
present within a biological sample from an animal suspected of having such an infection. 

66. A therapeutic composition useful in treating humans with Bordetella pertussis disease 
comprising at least one antibody directed against the polypeptide of claim 30 and a suitable 
pharmaceutical carrier. 

67. A therapeutic composition useful in treating humans with Bordetella pertussis disease 
comprising at least one antibody directed against the polypeptide of claim 31 and a suitable 
pharmaceutical carrier. 

68. A therapeutic composition useful in treating humans with Bordetella pertussis disease 
comprising at least one antibody directed against the polypeptide of claim 32 and a suitable 
pharmaceutical carrier. 

69. A therapeutic composition useful in treating humans with Bordetella pertussis disease 
comprising at least one antibody directed against the polypeptide of claim 33 and a suitable 
pharmaceutical carrier. 

70. A therapeutic composition useful in treating humans with Bordetella pertussis disease
comprising at least one antibody directed against the polypeptide of claim 34 and a suitable 
pharmaceutical carrier. 

71. A kit for diagnosing infection with B. pertussis bacteria in a human comprising a 
polynucleotide of claims 35-43. 

72. A kit for diagnosing infection with B. pertussis bacteria in a human comprising a 
polypeptide of claim 30. 

73. A kit for diagnosing infection with B. pertussis bacteria in a human comprising a 
polypeptide of claim 31.

74. A kit for diagnosing infection with B. pertussis bacteria in a human comprising a 
polypeptide of claim 32. 

75. A kit for diagnosing infection with B. pertussis bacteria in a human comprising a 
polypeptide of claim 33. 

76. A kit for diagnosing infection with B. pertussis bacteria in a human comprising a 
polypeptide of claim 34. 

77. A method of identifying virulence genes from a pathogenicity island containing a type 
III secretion system from pathogenic strains of bacteria, comprising: 

designing degenerate PCR primers complementary to well-conserved regions specific
to the LcrD polypeptide of Yersinia;

amplifying the polynucleotide containing the DNA sequence between (and including
the DNA sequence of) the primers of lcrD-like genes present in said pathogenic strain
of bacteria;

sequencing the lcrD-like gene;

determining whether the DNA sequence is more homologous: to the virulence-
associated family of lcrD-like genes, or to the flagellar-associated family of lcrD-like
genes; and

if a virulence-associated member, sequencing the entire pathogenicity island, and
identifying genes within this sequence.

78. A method of determining whether a particular bacterial strain harbours a type III 
secretion system involved in pathogenicity, comprising: 

designing degenerate PCR primers complementary to well-conserved regions specific
to the LcrD polypeptide of Yersinia;

amplifying the polynucleotide containing the DNA sequence between (and including
the DNA sequence of) the primers to determine the presence of any lcrD-like genes in
said bacterial strain;

if amplified successfully, sequencing the lcrD-like gene; and

determining whether the DNA sequence is more homologous: to the virulence-
associated family of lcrD-like genes, or to the flagellar-associated family of lcrD-like
genes.
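
The final determining step of claims 77 and 78 (deciding which family a sequenced lcrD-like gene is more homologous to) might be sketched as follows. This is only an illustration under stated assumptions: the claims do not say how homology is measured, so simple percent identity over pre-aligned, equal-length sequences is assumed here, and the three short sequences are made-up toy strings, not data from this application.

```python
def percent_identity(a, b):
    """Percent identity of two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    # Count positions where both sequences carry the same residue (gaps never match).
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def classify_lcrd_like(query, virulence_ref, flagellar_ref):
    """Return the family the query is closer to, plus both identity scores."""
    v = percent_identity(query, virulence_ref)
    f = percent_identity(query, flagellar_ref)
    # A tie is reported as flagellar-associated, so that a virulence call
    # requires strictly higher identity (an arbitrary choice for this sketch).
    label = "virulence-associated" if v > f else "flagellar-associated"
    return label, v, f


# Hypothetical toy fragments, already aligned to the same length.
QUERY = "ATGGCTAAAGTCCTG"
VIRULENCE_REF = "ATGGCTAAGGTCCTG"
FLAGELLAR_REF = "ATGACTCAAGTGTTG"

label, v, f = classify_lcrd_like(QUERY, VIRULENCE_REF, FLAGELLAR_REF)
print(label, round(v, 1), round(f, 1))
```

In practice the comparison would be made against real family members with a proper alignment tool; the sketch only shows the decision rule.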

REMARKS 

The above-identified application is being entered into the National Phase from PCT 
application no. PCT/EP99/10297.

Specification and Sequence Listing 

An inadvertent error on page 38 (Table 3), which recites position 31773 as the end of the
open reading frame for orf14, has been corrected to show position 31818. As set forth below,
correcting this error is obvious.

Applicants respectfully request amendment of SEQ ID NO:69 and SEQ ID NO:70 as set
forth on substitute sheets 111, 113, and 114 submitted herewith pursuant to 37 CFR § 1.825.

In the original submission of the Sequence Listings, the Sequence Listing program
completely ignored the stop codon at position 396. Although the sequence shown in Fig. 5 is
correct, Table 3 states that orf14 is encoded in the complementary strand from position 31773 to
33005. Although position 33005 indicates the correct start codon for this open reading frame, the
end of the open reading frame is incorrectly stated. It should actually be at position 31818,
where the first stop codon (at position 396) is encountered.

Correcting this error is obvious. Given a properly stated start codon, and a correct DNA 
sequence, anyone would realize that the open reading frame MUST end where the first, in-frame,
stop codon in the sequence is encountered. This is obviously at position 396, and was clearly the
intention of the Applicants at the time of filing the international application. 

Three replacement sheets are provided as required under 37 CFR § 1.825. The error on
page 111 has been rectified to indicate the number of nucleotides in SEQ ID NO:69 as being
1188. The error on page 113 has been rectified to indicate that the open reading frame does not
extend past the stop codon at position 396. In addition, the number of amino acids in SEQ ID
NO:70 has been rectified as being 395. The error on page 114 has been rectified to indicate the
last amino acid in the open reading frame as being His395.
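
The corrected figures can be cross-checked with a short sketch. It assumes that the Table 3 coordinates are inclusive, that the stated range includes the stop codon, and that "position 396" counts codons from the start codon; these assumptions are consistent with the 1188 nucleotides and 395 amino acids stated above. The function names and the toy sequence are illustrative only.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_codon(seq):
    """1-based codon index of the first in-frame stop codon, or None."""
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i // 3 + 1
    return None


def complement_orf_end(start, n_codons):
    """Lowest coordinate of an ORF read on the complementary strand.

    start    -- coordinate of the first base of the start codon
    n_codons -- number of codons read, counting the stop codon
    """
    return start - 3 * n_codons + 1


stop_index = 396                           # first in-frame stop codon, per the remarks
end = complement_orf_end(33005, stop_index)
length_nt = 33005 - end + 1                # inclusive length of the coding range
n_amino_acids = stop_index - 1             # the stop codon encodes no residue

toy_stop = first_stop_codon("ATGGCACATTAAGGG")  # toy sequence: ATG GCA CAT TAA GGG
print(end, length_nt, n_amino_acids, toy_stop)  # 31818 1188 395 4
```

Under those assumptions the start at 33005 and a stop at codon 396 give an end coordinate of 31818, a length of 1188 nucleotides, and 395 amino acids, matching the corrected sheets.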

The corrected sequence listing for SEQ ID NO: 69 and 70 does not go beyond the 
disclosure apparent to anyone from Fig. 5 and Table 3 of the International Application as filed. 

The complete sequence listing is provided on a computer diskette. It is identical to the 
original written sequence listing in conjunction with the aforementioned corrections to SEQ ID 
NO: 69 and 70. 
Claims 

Claims 1-29 were cancelled. New claims 30-78 were added for the following reason: to 
put the claims in conformity with U.S. practice. 
No new matter has been introduced. 

Attached hereto is a marked-up version of the changes made to the specification and 
claims by the current amendment. The attached page is captioned "Version with Markings to 
Show Changes Made". Applicants respectfully request that a timely Notice of Allowance be 
issued in this case. 



GLAXOSMITHKLINE 

Corporate Intellectual Property - UW2220 

P.O. Box 1539 

King of Prussia, PA 19406-0939 
Phone (610) 270-5024 
Facsimile (610) 270-5090 




Zoltan Kerekes
Attorney for Applicants 
Registration No. 38,938 

VERSION WITH MARKINGS TO SHOW CHANGES
IN THE SPECIFICATION: 

Table 3 appearing on page 38 has been amended as follows: 



Table 3

Names    Coding sequence from/to        Coding DNA    SEQ ID    Homologous genes (from Yersinia,
         (with reference to Fig. 5)     strand        NO:       unless otherwise specified)

Class II ORFs which putatively code for effector proteins

bopN     11906/13003                    complement    41        YopN (= lcrE)
orf1     6160/6747                      direct        43        None
orf2     10752/11120                    complement    45        None
orf3     11117/11527                    complement    47        None
orf4     11532/11909                    complement    49        None
orf5     13002/13784                    direct        51        None
orf6     13806/14081                    direct        53        None
orf7     14630/15571                    direct        55        None
orf8     15601/16803                    direct        57        None
orf9     16827/17288                    direct        59        BcrH
orf10    17293/17814                    direct        61        pcr4 (Pseudomonas aeruginosa)
orf11    29412/29591                    complement    63        None
orf12    29555/30529                    complement    65        None
orf13    30631/31776                    direct        67        None
orf14    [31773]31818/33005             complement    69        None
orf15    32370/33014                    direct        71        None

IN THE SEQUENCE LISTING:



IN THE CLAIMS: 

Claims 1-29 have been cancelled. New claims 30-78 have been added as follows: 

30. An isolated polypeptide comprising an amino acid sequence which has at least 75% 
identity to the amino acid sequence selected from the group consisting of: SEQ ID NO:42, 44,
46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70 and 72 over its entire length.

31. The polypeptide as claimed in claim 30 comprising the amino acid sequence selected
from the group consisting of: SEQ ID NO:42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68,
70 and 72.

32. An isolated polypeptide of SEQ ID NO:42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64,
66, 68, 70 or 72.

33. An isolated polypeptide comprising a fragment of at least 7 consecutive amino acids of 
the polypeptide as claimed in any one of claims 30 to 32, wherein the fragment comprises an 
epitope. 

34. The polypeptide of claim 33, wherein the fragment is immunogenic.

35. An isolated polynucleotide comprising a nucleotide sequence encoding a polypeptide that
has at least 75% identity to the amino acid sequence of SEQ ID NO:42, 44, 46, 48, 50, 52, 54, 56,
58, 60, 62, 64, 66, 68, 70 or 72 over its entire length; or a nucleotide sequence complementary to
said isolated polynucleotide. 

36. An isolated polynucleotide comprising a nucleotide sequence that has at least 75% identity 
to a nucleotide sequence, encoding a polypeptide of SEQ ID NO:42, 44, 46, 48, 50, 52, 54, 56,
58. 60, 62, 64, 66, 68, 70 or 72, over its entire length; or a nucleotide sequence complementary 
to said isolated polynucleotide. 

37. An isolated polynucleotide which comprises a nucleotide sequence which has at least
75% identity to that of SEQ ID NO:41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69 or
71 over its entire length; or a nucleotide sequence complementary to said isolated polynucleotide.

38. The isolated polynucleotide as claimed in claim 35 in which the identity is at least 95%
to SEQ ID NO:41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69 or 71 over its entire
length. 

39. The isolated polynucleotide as claimed in claim 36 in which the identity is at least 95% 
to SEQ ID NO:41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69 or 71 over its entire
length.

40. The isolated polynucleotide as claimed in claim 37 in which the identity is at least 95%
to SEQ ID NO:41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69 or 71 over its entire
length. 

41. An isolated polynucleotide comprising a nucleotide sequence encoding the polypeptide of
SEQ ID NO:42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70 or 72.

42. An isolated polynucleotide comprising the polynucleotide of SEQ ID NO:41, 43, 45,
47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69 or 71.

43. An isolated polynucleotide comprising a nucleotide sequence encoding the polypeptide of
SEQ ID NO:42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70 or 72, obtainable by
screening an appropriate library under stringent hybridization conditions with a labeled probe
having the sequence of SEQ ID NO:41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69 or
71 or a fragment thereof. 

44. An expression vector comprising an isolated polynucleotide according to any one of 
claims 35-43. 

45. A recombinant live microorganism comprising an isolated polynucleotide according to
any one of claims 35-43. 

46. A host cell comprising the expression vector of claim 44 or a subcellular fraction or a 
membrane of said host cell. 

47. A process for producing the polypeptide of claim 30 comprising the steps of culturing a 
host cell of claim 46 under conditions sufficient for the production of said polypeptide and 
recovering the polypeptide from the culture medium. 

48. A process for expressing a polynucleotide of any one of claims 35-43 
comprising transforming a host cell with an expression vector comprising at least 
one of said polynucleotides and culturing said host cell under conditions sufficient 
for expression of any one of said polynucleotides. 

49. A vaccine composition comprising an effective amount of the polypeptide 
of claim 30 and a pharmaceutically acceptable carrier. 

50. A vaccine composition comprising an effective amount of the polypeptide 
of claim 31 and a pharmaceutically acceptable carrier.

51. A vaccine composition comprising an effective amount of the polypeptide 
of claim 32 and a pharmaceutically acceptable carrier. 

52. A vaccine composition comprising an effective amount of the polypeptide 
of claim 33 and a pharmaceutically acceptable carrier. 

53. A vaccine composition comprising an effective amount of the polypeptide 
of claim 34 and a pharmaceutically acceptable carrier. 

54. The vaccine composition of claim 49, wherein the polypeptide has an 
amino acid sequence selected from the group consisting of: SEQ ID NO:42, 46,
48, 50, 52, 54, 56, 58, 60 and 62.

55. A vaccine composition comprising an effective amount of the polynucleotide of any one
of claims 35 to 43 and a pharmaceutically acceptable carrier.

56. The vaccine composition according to any one of claims 49-55, wherein 
said composition comprises at least one other Bordetella pertussis antigen. 

57. An antibody immunospecific for the amino acid sequence of claim 30 or 31.

58. An antibody immunospecific for the polypeptide of claim 32. 

59. An antibody immunospecific for the fragment of claim 33. 

60. An antibody immunospecific for the fragment of claim 34. 

61. A method of diagnosing a Bordetella pertussis infection, comprising identifying a
polypeptide as claimed in claim 30, or an antibody that is immunospecific for said polypeptide,
present within a biological sample from an animal suspected of having such an infection. 

62. A method of diagnosing a Bordetella pertussis infection, comprising identifying a 
polypeptide as claimed in claim 31, or an antibody that is immunospecific for said polypeptide,
present within a biological sample from an animal suspected of having such an infection. 

63. A method of diagnosing a Bordetella pertussis infection, comprising identifying a 
polypeptide as claimed in claim 32, or an antibody that is immunospecific for said polypeptide,
present within a biological sample from an animal suspected of having such an infection. 

64. A method of diagnosing a Bordetella pertussis infection, comprising identifying a 
polypeptide as claimed in claim 33, or an antibody that is immunospecific for said polypeptide,
present within a biological sample from an animal suspected of having such an infection. 

65. A method of diagnosing a Bordetella pertussis infection, comprising identifying a
polypeptide as claimed in claim 34, or an antibody that is immunospecific for said polypeptide,
present within a biological sample from an animal suspected of having such an infection. 

66. A therapeutic composition useful in treating humans with Bordetella pertussis disease 
comprising at least one antibody directed against the polypeptide of claim 30 and a suitable 
pharmaceutical carrier. 

67. A therapeutic composition useful in treating humans with Bordetella pertussis disease 
comprising at least one antibody directed against the polypeptide of claim 31 and a suitable
pharmaceutical carrier. 

68. A therapeutic composition useful in treating humans with Bordetella pertussis disease 
comprising at least one antibody directed against the polypeptide of claim 32 and a suitable 
pharmaceutical carrier. 

69. A therapeutic composition useful in treating humans with Bordetella pertussis disease 
comprising at least one antibody directed against the polypeptide of claim 33 and a suitable 
pharmaceutical carrier. 

70. A therapeutic composition useful in treating humans with Bordetella pertussis disease 
comprising at least one antibody directed against the polypeptide of claim 34 and a suitable 
pharmaceutical carrier. 

71. A kit for diagnosing infection with B. pertussis bacteria in a human comprising a 
polynucleotide of claims 35-43. 

72. A kit for diagnosing infection with B. pertussis bacteria in a human comprising a 
polypeptide of claim 30. 

73. A kit for diagnosing infection with B. pertussis bacteria in a human comprising a 
polypeptide of claim 31.

74. A kit for diagnosing infection with B. pertussis bacteria in a human comprising a
polypeptide of claim 32. 

75. A kit for diagnosing infection with B. pertussis bacteria in a human comprising a 
polypeptide of claim 33. 

76. A kit for diagnosing infection with B. pertussis bacteria in a human comprising a 
polypeptide of claim 34. 

77. A method of identifying virulence genes from a pathogenicity island containing a type 
III secretion system from pathogenic strains of bacteria, comprising: 

designing degenerate PCR primers complementary to well-conserved regions specific 
to the LcrD polypeptide of Yersinia; 

amplifying the polynucleotide containing the DNA sequence between (and including 
the DNA sequence of) the primers of lcrD-like genes present in said pathogenic strain 
of bacteria; 

sequencing the lcrD-like gene; 

determining whether the DNA sequence is more homologous: to the virulence-associated 
family of lcrD-like genes, or to the flagellar-associated family of lcrD-like 
genes; and 

if a virulence-associated member, sequencing the entire pathogenicity island, and 
identifying genes within this sequence. 

78. A method of determining whether a particular bacterial strain harbours a type III 
secretion system involved in pathogenicity, comprising: 

designing degenerate PCR primers complementary to well-conserved regions specific 
to the LcrD polypeptide of Yersinia; 

amplifying the polynucleotide containing the DNA sequence between (and including 
the DNA sequence of) the primers to determine the presence of any lcrD-like genes in 
said bacterial strain; 

if amplified successfully, sequencing the lcrD-like gene; and 

determining whether the DNA sequence is more homologous: to the virulence-associated 
family of lcrD-like genes, or to the flagellar-associated family of lcrD-like 
genes. 

Abstract 

This invention relates to a general method for detecting pathogenic strains of 
bacteria which harbour a type III secretion system. More particularly, this invention relates 
to the methods as applied to the pathogen Bordetella pertussis. Furthermore, the invention 
relates to newly identified polynucleotides within these regions, virulent polypeptides 
encoded by them and to the use of such polynucleotides and polypeptides, and to their 
production. More particularly the polynucleotides and polypeptides of the present invention 
relate to the virulent effector proteins associated with the type III secretion system of 
Bordetella pertussis, which are particularly suitable for vaccine purposes. 

FIELD OF INVENTION 

This invention relates to a general method for detecting pathogenic strains of 
bacteria that harbour a type III secretion system, and characterising regions of the 
chromosome of said strain where virulence genes reside. More particularly, this 
invention relates to the method as applied to the pathogen Bordetella pertussis. 
Furthermore, the invention relates to newly identified polynucleotides within these 
regions, virulent polypeptides encoded by them and to the use of such polynucleotides 
and polypeptides, and to their production. 



BACKGROUND OF THE INVENTION 



Type III secretion systems: 

Pathogenic bacteria invade many different niches in a broad host range and cause 
a wide variety of syndromes. For this reason, it was previously believed that each 
disease might be induced by a distinct molecular mechanism. However, the spectrum of 
such mechanisms is not as broad as first imagined; rather, bacteria exploit a number of 
common molecular tools to achieve a range of goals. Among these tools are type III 
secretion systems, which provide a means for bacteria to target virulence factors directly 
at host cells. These factors then tamper with host cell functions to the pathogens' benefit. 

The type III export system is responsible for secretion of Salmonella and Shigella 
invasion and virulence factors, Enteropathogenic Escherichia coli (EPEC) signal 
transduction molecules, virulence factors in several plant pathogens (for instance 
Xanthomonas campestris pv. vesicatoria [Fenselau et al., 1992]) and Yops proteins in 
Yersinia. The Yops export mechanism has been the most intensively investigated type III 
secretion apparatus (see for instance: Allaoui et al., 1994; Bergman et al., 1994). In this 
system, more than 20 different Ysc/Lcr proteins, all encoded by the virulence plasmid 
pYV, are presumed to compose a secretion channel spanning the Yersinia cell envelope. 
Besides these elements involved in the secretion machinery, the pYV plasmid codes for 
the Yops proteins which are the secreted substrates and appear as the actual effectors of 
virulence. 

Comparative studies of type III secretion systems originating from different 
species reveal that the components of the secretion machinery are conserved (Gygi et al., 
1995; Bogdanove et al., 1996). In addition, homologs have been found in determinants 
which take part in flagellar assembly, indicating that this secretion pathway may be 
involved in surface organelle biosynthesis (Ramakrishnan et al., 1991). 

In contrast, however, the secreted substrates share no similarities, except in a few 
cases. Therefore, the abandoned concept of a distinct molecular mechanism 
corresponding to each disease could reappear at the level of effector proteins. 

Pathogenicity island 

Pathogenicity islands have emerged as a novel theme in the field of bacterial 
virulence. Although they can comprise type III secretion systems, they do not exclusively 
do so. 

Early in the search for virulence genes, it was observed that many of these genes 
resided on plasmids. However, numerous virulence genes were also found on the 
chromosome. Surprisingly, the chromosomal virulence genes are also often clustered in 
functionally related groups. Such groups of virulence genes gave rise to the concept of 
pathogenicity islands (Pais) which can be defined as compact, distinct genetic units 
carrying virulence genes. These units, often flanked by direct repeats, occupy large 
chromosomal regions (often > 30 kb) and are present in pathogenic strains, whilst being 
absent or sporadically distributed in less-pathogenic (or non-pathogenic) strains of a 
bacterial species. These DNA segments are frequently associated with tRNA genes 
and/or insertion sequence (IS) elements at their boundaries. In addition, their G+C 
content often differs from that of host bacterial DNA, suggesting a foreign origin. 
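
As a rough illustration of the G+C observation above, the following Python sketch flags windows of a sequence whose G+C fraction departs from that of the sequence as a whole. It is only a sketch: the window size, step and threshold are arbitrary assumptions chosen for the example, and the function names are hypothetical rather than part of the present disclosure.

```python
def gc_fraction(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        return 0.0
    return sum(base in "GC" for base in counted) / len(counted)


def gc_windows(seq, window=1000, step=500):
    """Yield (start, gc_fraction) for successive windows along seq."""
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        yield start, gc_fraction(seq[start:start + window])


def atypical_windows(seq, window=1000, step=500, threshold=0.10):
    """Return windows whose G+C fraction differs from the overall value.

    threshold is an assumed, illustrative cut-off (absolute difference
    in G+C fraction); it has no special biological standing.
    """
    overall = gc_fraction(seq)
    return [
        (start, gc)
        for start, gc in gc_windows(seq, window, step)
        if abs(gc - overall) >= threshold
    ]
```

Applied to a long chromosomal sequence, `atypical_windows` would return the start positions of candidate regions of foreign origin; such regions would still need to be confirmed by the other criteria given above (flanking repeats, tRNA genes, IS elements).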

Pathogenicity islands have been discovered in an increasing number of bacterial 
pathogens, including different categories of E. coli, Salmonella typhimurium, Yersinia 
spp., Helicobacter pylori, Vibrio cholerae, etc. 

The first intensively studied pathogenicity islands were Pai I and Pai II, which 
encode the haemolysin determinants of uropathogenic E. coli. These two Pais are 
flanked by direct repeats and can be deleted from the chromosome at frequencies of 10^-4, 
resulting in non-virulent mutant strains. Another pathogenicity island of 35 kb has 
recently been identified on the chromosome of enteropathogenic E. coli (EPEC) and was 
found to encode all known determinants involved in the so-called "attaching and 
effacing" (AE) lesion formation. This region was therefore referred to as "locus of 
enterocyte effacing" (LEE). Despite the fact that uropathogenic and enteropathogenic E. 
coli cause completely different infectious diseases, Pai I of the uropathogenic strains and 
the LEE locus of EPEC are inserted at exactly the same positions into the E. coli 
chromosome. 

While some authors support a definition of pathogenicity islands which 
necessarily includes their chromosomal location, others have extended the concept to 
blocks of virulence genes, regardless of their location in chromosomes, plasmids or 
phages. The fact that, on the one hand, phages and plasmids can easily insert into and excise 
from the chromosome and, on the other, that cryptic origins of plasmid replication, or 
phage-related sequences were detected in Pais, prompted the latter and less restrictive 
definition. 

The pathogenicity islands (PAIs) which code for a type III secretion system 
encompass genes that divide into two classes, I and II. Class I encompasses the genes 
coding for the secretion machinery components and their regulators of expression, class 






WO 00/37493 



PCT/EP99/10297 



II encompasses the genes encoding secreted effector proteins. Both Yersinia lcrD and 
yscU belong to class I. The precise functions of class I determinants are not well 
understood. Although it is sometimes not straightforward to make a clear distinction 
between class I and class II components, genes of class I can be identified as being 
present in many different species, and a comparison of their respective gene sequences 
indicates that equivalent genes share a significant (yscl, yscO) or even high level (lcrD, 
yscU, yscN) of sequence similarity (Hueck, 1998). 

The second class of genes (class II) codes for proteins which constitute the 
substrate secreted by the translocon. These proteins appear as the actual effectors of 
virulence and are referred to as target proteins, virulence effector proteins or, simply, 
effectors. In contrast to the situation prevailing in class I gene products, the effectors 
share no, or very weak, similarities between species. Effector proteins are those which 
present the best biological, vaccine and diagnostic potential. 

The inventors have discovered that the clustering of class I and class II genes 
inside a single pathogenicity island offers the opportunity of conveniently finding and 
characterising unknown class II genes by targeting class I genes which can be identified 
using a known sequence of one of their numerous orthologues. 

Bordetella pertussis 

Whooping cough is a disease caused by infection with Bordetella pertussis, and is a 
serious and debilitating human disease, particularly in young children. Although whole 
cell and acellular vaccines are available that are effective against the disease, there 
remains a need for the identification of further highly purified pertussis proteins that 
could be used in a more efficacious pertussis vaccine. 

Although many pertussis virulence-associated factors are known, such as pertussis 
toxin, filamentous haemagglutinin and pertactin, which have been included in various 
acellular vaccines, there is no convenient genetic method for identifying further virulence 
factors using the pertussis genome (short of laboriously sequencing the whole genome). 
Although class I type III secretion system virulence genes have recently been shown to 
exist in B. bronchiseptica and B. pertussis (Yuk et al., 1998), there has been no complete 
analysis of a pathogenicity island in Bordetella, and the identity and characterisation of 
effector genes within such a pathogenicity island has been unknown up until the present 
invention. 

SUMMARY OF THE INVENTION 

In one aspect, the invention relates to a method for the identification of new 
virulence genes in bacterial strains containing a type III secretion system. In particular, 
the invention allows the identification of the effector virulence genes associated within a 
pathogenicity island containing the genes for the type III secretion system. Another 
aspect of the invention is a method for the identification of pathogenic bacterial strains 
containing a type III secretion system. Another aspect of the invention relates to 
Bordetella pertussis BopN, Orf1, Orf2, Orf3, Orf4, Orf5, Orf6, Orf7, Orf8, Orf9, Orf10, 
Orf11, Orf12, Orf13, Orf14, Orf15 effector proteins, and the respective polynucleotide 
sequences encoding them. 

Although the general concepts of type III secretion systems and pathogenicity 
islands have been reported, the problem of how simply and reliably to identify whether 
any given organism has such cell machinery has not been solved until now. Such 
a method is extremely useful to establish whether a given strain has a type III secretion 
system within a pathogenicity island, to characterise unknown virulence genes within the 
pathogenicity island, and to use in quick diagnostic methods for determining whether a 
cultured bacterial strain containing a type III secretion system is pathogenic. 

In the present invention, a novel, general method is described to achieve the 
above aims. More specifically, the invention utilises a method that employs ideally-suited 
primers designed specifically from the sequence of the virulent Yersinia 
enterocolitica lcrD gene as a target sequence. The presence of a type III secretion system 
within a pathogenicity island in Bordetella pertussis was discovered, and every gene 
within the pathogenicity island was characterised. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1. Nucleotide and deduced amino acid sequences of the cloned 152 bp 
amplicon. The primers involved in the original amplification, the subsequent nested 
PCR, and the gene library screening are all derived from this sequence, and listed 
specifically in Table 1. 

Fig. 2. PileUp figure from the deduced amino acid sequences homologous to 
Yersinia LcrD. Abbreviations: BbuFlhA = Borrelia burgdorferi FlhA; TpaFlhA = 
Treponema pallidum FlhA; BsuFlhA = Bacillus subtilis FlhA; CjeFlbA = Campylobacter 
jejuni FlbA; HpyFlhA = Helicobacter pylori FlhA; EcoFlhA = Escherichia coli FlhA; 
StyFlhA = Salmonella typhimurium FlhA; YenFlhA = Yersinia enterocolitica FlhA; 
PmiFlhA = Proteus mirabilis FlhA; CcrFlbF = Caulobacter crescentus FlbF; EcoFhiA = 
Escherichia coli FhiA; EamHrpI = Erwinia amylovora HrpI; PsyHrpI = Pseudomonas 
syringae HrpI; ECEPSepA = Enteropathogenic Escherichia coli SepA; StySsaV = 
Salmonella typhimurium SsaV; RsoHrpO = Ralstonia solanacearum HrpO; XcaHrpC2 = 
Xanthomonas campestris HrpC2; SflMxiA = Shigella flexneri MxiA; StyInvA = 
Salmonella typhimurium InvA; PaePcrD = Pseudomonas aeruginosa PcrD; YenLcrD = 
Yersinia enterocolitica LcrD; BpeBcrD = Bordetella pertussis BcrD; CpsTtsB = 
Chlamydia psittaci TtsB. 

Fig. 3. Organization of the Bordetella pertussis pathogenicity island (Pai). Four 
house-keeping genes (hatched boxes) and the transposase gene of IS481 (black box) 
surround the Pai. The Pai consists of genes coding for determinants involved in the 
secretory apparatus and its regulation (class I genes, in grey boxes) as well as ORFs 
which putatively code for effector proteins (class II genes, in white boxes). Letters 
indicate the respective class I bsc genes whereas numbers correspond to the class II 
ORFs listed in Table 3. 

Fig. 4. PileUp figure from the deduced amino acid sequences homologous to 
Yersinia YscU. Abbreviations: BbuFlhB = Borrelia burgdorferi FlhB; TpaFlhB = 
Treponema pallidum FlhB; EcoFlhB = Escherichia coli FlhB; StyFlhB = Salmonella 
typhimurium FlhB; PmiFlhBpart = partial Proteus mirabilis FlhB; YenFlhB = Yersinia 
enterocolitica FlhB; BsuFlhB = Bacillus subtilis FlhB; HpyFlhB = Helicobacter pylori 
FlhB; AtuFlhB = Agrobacterium tumefaciens FlhB; CcrPodW = Caulobacter crescentus 
PodW; SflSpa40 = Shigella flexneri Spa40; StySpaS = Salmonella typhimurium SpaS; 
EcoEscU = Escherichia coli EscU; StySsaU = Salmonella typhimurium SsaU; BpeBscU 
= Bordetella pertussis BscU; YenYscU = Yersinia enterocolitica YscU; RsoHrpN = 
Ralstonia solanacearum HrpN; XcaOrfDpart = partial Xanthomonas campestris OrfO; 
EamHrcU = Erwinia amylovora HrcU; EheHrcUpart = partial Erwinia herbicola HrcU; 
PsyHrpY = Pseudomonas syringae HrpY; CpsOrf1 = Chlamydia psittaci Orf1. 

Fig. 5. The DNA sequence of the Bordetella pertussis genome comprising the 
type III secretion system pathogenicity island. Reference should be made to tables 2, 3, 
and 4 and Fig. 3 for information regarding open reading frames. 

Fig. 6. Purification of MBP-Orf2, -4, -6 and -10 by affinity chromatography. 
The ultracentrifugation supernatants of each lysate (left part of the panels) and the 
products eluted from the affinity column (right part of the panels) were analysed by 
SDS-PAGE and revealed by Coomassie blue staining. 


DESCRIPTION OF THE INVENTION 

Type III secretion systems identified to date are encoded by either chromosomal or 
plasmid-borne pathogenicity island genes. However, nowhere in the prior art was it realised 
that the conservation of genes encoding class I components of type III secretion systems 
and the clustering of these genes with effector protein coding sequences offered the 
opportunity for detecting unidentified target proteins involved in host colonisation. Such 
proteins would be potentially valuable in both the vaccine and diagnostic fields. 

Although the known sequence of a gene encoding any conserved (class I) type III 
secretion machinery protein can be used in performing this invention, the lcrD gene is 
preferred. The chosen gene will act as a target for detecting unidentified pathogenicity 
islands in related bacterial species. The lcrD gene from Yersinia is preferred as it codes 
for the archetype of the recently identified LcrD/FlbF family of proteins. Members of 
this family are involved in host cell invasion, virulence in several phytopathogenic 
bacteria or in flagellar assembly. The lcrD gene is preferred because the LcrD protein, and 
consequently the gene encoding it, is one of the most conserved determinants of the 
secretion machinery. Additionally, multiple amino acid comparisons have shown that the 
LcrD family members can be classified into two main subfamilies, which, 
interestingly, can be correlated with the functions assigned to the proteins of each 
subfamily. One subfamily encompasses all the motility-involved proteins, while the 
other encompasses all the virulence-related determinants. This observation is illustrated 
in Fig. 2 (and mentioned in Gygi et al. (1995) & Bogdanove et al. (1996)). Thus, if an 
unknown lcrD homologous gene is identified, it may, after being routinely sequenced, be 
classified as a virulence or a flagellar gene. Once the pathogenicity island is identified, 
this simple test would therefore define whether the search for other virulence genes on 
the pathogenicity island should be initiated. 

The preferred method for identifying unknown pathogenicity islands comprising 
a type III secretion system is by: 

i) identifying two highly conserved regions of the target protein sequence (preferably 
of LcrD). Preferably, both regions should contain conserved amino acids which are 
encoded by the fewest number of codon possibilities e.g. Methionine (ATG being 
the only possibility) or Tryptophan (TGG being the only possibility). This 
minimises the number of permutations in both degenerate primer sets that are 
designed in the next stage of the process, thus ensuring a greater probability that 
each primer set will specifically anneal to the unknown lcrD-equivalent gene 
(thereby minimising background non-specific interactions). Most preferably, regions 
should also be chosen that are clearly distinguishable from the paralogue flhA 
flagellar genes, present in all flagellated bacterial strains. 

ii) designing a degenerate set of primers for both of the chosen regions such that a) the 
primers are at least 15 bases long, preferably 20-30 bases long, and still more 
preferably 21-23 bases long, b) they are degenerate at bases that can be more than 
one type of nucleotide whilst still encoding the same amino acid (due to the 
degeneracy of codon usage for amino acids), but no more degenerate than is 
required to cover all permutations for the amino acid region selected, and c) the 
primer set that encodes the more N-terminal region of the chosen protein should 
correspond to the coding strand of its corresponding double-stranded DNA 
sequence, and the set that encodes the more C-terminal region should correspond to 
the complementary strand of the corresponding double-stranded DNA sequence. 

iii) synthesising the degenerate primer sets of step ii) using conventional DNA synthesis 
methods well known in the art. 

iv) purifying the primer sets of step iii) 

v) adding both the primer sets and a sample containing nucleic acid from a bacterial 
strain (preferably a cell sample of the bacterial species itself) together in appropriate 
quantities and in an appropriate buffer in order to perform a polymerase chain 
reaction (PCR) 






vi) performing a PCR reaction in order to amplify the region of the gene between the 
two primers (conditions for performing the PCR reaction can be optimised using 
techniques well known in the art) 

vii) observing the reaction products on a gel (preferably an agarose gel) for an amplified 
product of the size expected; if no such product is present, the bacterial strain is 
unlikely to use a type III secretion system; if such a product is present, the bacterial 
strain is likely to have a type III secretion system, and is likely to be pathogenic. 
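
The primer-design reasoning of steps i) and ii) can be sketched in Python as follows. The sketch counts the codon permutations for a conserved amino acid stretch, picks the window with the fewest permutations, and writes the corresponding degenerate primer in IUPAC ambiguity codes, with the reverse complement for the more C-terminal region. It assumes the standard genetic code; the function names and the example peptides are hypothetical and are not sequences from the present disclosure. Note also that collapsing codons position by position slightly over-covers the six-codon amino acids (Leu, Arg, Ser), so the IUPAC string is an approximation for those residues.

```python
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC ambiguity codes keyed by the sorted set of bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}
IUPAC_COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def codons_for(aa):
    """All codons of the standard code that encode amino acid aa."""
    return [codon for codon, residue in CODON_TABLE.items() if residue == aa]


def degeneracy(peptide):
    """Number of codon permutations that can encode the peptide."""
    return prod(len(codons_for(aa)) for aa in peptide)


def best_window(peptide, length):
    """Return (start, window, degeneracy) for the least degenerate window."""
    windows = [
        (start, peptide[start:start + length])
        for start in range(len(peptide) - length + 1)
    ]
    start, window = min(windows, key=lambda w: degeneracy(w[1]))
    return start, window, degeneracy(window)


def degenerate_primer(peptide):
    """Coding-strand primer for the peptide, written in IUPAC codes."""
    primer = []
    for aa in peptide:
        codons = codons_for(aa)
        for position in range(3):
            bases = "".join(sorted({codon[position] for codon in codons}))
            primer.append(IUPAC[bases])
    return "".join(primer)


def reverse_complement(primer):
    """Complementary-strand primer, as needed for the C-terminal region."""
    return primer.translate(IUPAC_COMPLEMENT)[::-1]
```

A window of seven residues gives a 21-base primer, which falls within the most preferred length range of step ii). Methionine and Tryptophan each contribute a factor of one to `degeneracy`, which is why regions rich in these residues are favoured in step i).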



The preferred method for confirming that the amplified product actually 
corresponds to a virulence gene is by carrying out steps i)-vii) above (where the target 
protein is LcrD) and then: 

viii) optionally separating the product of correct size from any background products of 
incorrect size by removing the correct band from the gel, purifying the product by 
conventional means, and amplifying the product once more with the two degenerate 
primer sets in another PCR reaction (under preferably more stringent PCR 
conditions) [this step is required should the product of step vii) not be pure enough 
for direct cloning] 

ix) inserting the DNA fragment by conventional means into a vector which is capable of 
being sequenced, and sequencing the fragment 

x) comparing the deduced amino acid sequence of ix) with that of known members of 
the LcrD/FlbF family of proteins to associate the amplified product as being part of 
either a virulence or a flagellar gene. 



And optionally: 

xi) using the internal sequence of the fragment to design primers that are the exact 
sequence of, and specific to, the unknown lcrD-equivalent gene. 

xii) using the primers of xi) to screen a genomic library of the organism for 
positive clones 

xiii) isolating the clones of xii), and sequencing one or more of said clones 

xiv) scanning the sequence of one clone (and overlapping sequences of other clones) to 
search for an open reading frame which is approximately the same size as lcrD 
(approximately 2100 bp), and encodes a protein homologous to LcrD 

xv) ascertaining whether the LcrD-equivalent protein is more homologous with the flbF 
(flagellar protein secretion) gene family or the lcrD (type III secretion system 
pathogenicity island) gene family. 
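
The family assignment of steps x) and xv) can be sketched as below. The sketch assumes that the query and the reference proteins are rows of one multiple alignment (for instance a PileUp output) with gaps written as "-", and assigns the query to the family containing its closest reference. The function names are hypothetical, and any sequence strings passed in are the caller's own data; a real comparison would use full-length family members and a proper similarity score.

```python
def column_identity(row_a, row_b):
    """Percent identity between two rows of the same multiple alignment.

    Columns where either row has a gap ('-') are ignored.
    """
    if len(row_a) != len(row_b):
        raise ValueError("rows must come from the same alignment")
    compared = [(x, y) for x, y in zip(row_a, row_b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    return 100.0 * sum(x == y for x, y in compared) / len(compared)


def classify_homologue(query_row, families):
    """Assign the query to the family holding its most similar member.

    families maps a family label (e.g. "virulence", "flagellar") to a
    dict of {member name: aligned row}. Returns (label, best score per family).
    """
    best = {
        label: max(column_identity(query_row, row) for row in rows.values())
        for label, rows in families.items()
    }
    return max(best, key=best.get), best
```

Taking the best match per family, rather than the average, is a design choice made here for simplicity; a dendrogram built from the full alignment, as in Fig. 2, gives a more robust picture.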

The preferred method for characterising the whole pathogenicity island and 
defining unidentified virulence effector genes is by carrying out steps i)-xv) above 
(where the target protein is LcrD) and then: 

xvi) if the sequence is more homologous with the lcrD gene family, designing primers at 
either extreme of the gene sequence already ascertained, and scanning and 
sequencing the genomic library (using a standard chromosome walking strategy - 
where the insert boundaries of an original clone serve as a probe for screening and 
cloning adjacent regions) to sequence eventually the whole of the pathogenicity 
island (both boundaries of which will be defined by the presence of either direct or 
inverted repeats, or insertion sequences, or the presence of house-keeping genes) 

xvii) defining unidentified virulence effector genes within the sequenced pathogenicity 
island 

xviii) cloning, expressing and characterising the virulence genes of xvii) which encode 
virulence effector proteins of the organism 
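
Steps xiv) and xvii) both rely on scanning sequenced DNA for open reading frames. A minimal Python sketch of such a scan over all six reading frames is given below. It assumes ATG start codons and the three standard stop codons only, whereas bacterial genes may also start at alternative codons such as GTG or TTG; the function names and the default minimum length are illustrative assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq, min_len):
    """Yield (start, end) of ATG...stop ORFs in the three frames of one strand.

    end is exclusive and includes the stop codon; only ORFs of at least
    min_len bases are reported.
    """
    seq = seq.upper()
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                if pos + 3 - start >= min_len:
                    yield start, pos + 3
                start = None


def find_orfs(seq, min_len=300):
    """Return (strand, start, end) for ORFs on both strands.

    Coordinates always refer to the forward strand of seq.
    """
    length = len(seq)
    found = [("+", s, e) for s, e in orfs_in_strand(seq, min_len)]
    found += [
        ("-", length - e, length - s)
        for s, e in orfs_in_strand(reverse_complement(seq), min_len)
    ]
    return sorted(found, key=lambda orf: orf[1])
```

For step xiv), the candidates returned by `find_orfs` could be filtered for a length close to that of lcrD (approximately 2100 bp) before their translations are compared with LcrD.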



Definitions 

"Bordetella pathogenicity proteins" refers generally to polypeptides having the 
amino acid sequence encoded by the genes defined in tables 2 and 3, or an allelic variant 
thereof. These proteins are: BcrD, BcrH, BscC, BscD, BscE, BscF, BscI, BscJ, BscK, 
BscL, BscN, BscO, BscP, BscQ, BscR, BscS, BscT, BscU, BscV, BrpL, BopN, Orf1, 
Orf2, Orf3, Orf4, Orf5, Orf6, Orf7, Orf8, Orf9, Orf10, Orf11, Orf12, Orf13, Orf14, 
Orf15. 

"Bordetella pathogenicity genes" refers to polynucleotides having the nucleotide 
sequence defined in tables 2 and 3, or allelic variants thereof and/or their complements. 
These genes are: bcrD, bcrH, bscC, bscD, bscE, bscF, bscI, bscJ, bscK, bscL, bscN, 
bscO, bscP, bscQ, bscR, bscS, bscT, bscU, bscV, brpL, bopN, orf1, orf2, orf3, orf4, orf5, 
orf6, orf7, orf8, orf9, orf10, orf11, orf12, orf13, orf14, orf15. 

"Polypeptide" refers to any peptide or protein comprising two or more amino 
acids joined to each other by peptide bonds or modified peptide bonds, i.e., peptide 
isosteres. "Polypeptide" refers to both short chains, commonly referred to as peptides, 
oligopeptides or oligomers, and to longer chains, generally referred to as proteins. 
Polypeptides may contain amino acids other than the 20 gene-encoded amino acids. 
"Polypeptides" include amino acid sequences modified either by natural processes, such 
as posttranslational processing, or by chemical modification techniques which are well 
known in the art. Such modifications are well described in basic texts and in more 
detailed monographs, as well as in a voluminous research literature. Modifications can 
occur anywhere in a polypeptide, including the peptide backbone, the amino acid side-chains 
and the amino or carboxyl termini. It will be appreciated that the same type of 
modification may be present in the same or varying degrees at several sites in a given 
polypeptide. Also, a given polypeptide may contain many types of modifications. 
Polypeptides may be branched as a result of ubiquitination, and they may be cyclic, with 
or without branching. Cyclic, branched and branched cyclic polypeptides may result 
from posttranslational natural processes or may be made by synthetic methods. 
Modifications include acetylation, acylation, ADP-ribosylation, amidation, covalent 
attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a 
nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative, 
covalent attachment of phosphatidylinositol, cross-linking, cyclization, disulfide bond 
formation, demethylation, formation of covalent cross-links, formation of cystine, 
formation of pyroglutamate, formylation, gamma-carboxylation, glycosylation, GPI 
anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation, 
proteolytic processing, phosphorylation, prenylation, racemization, selenoylation, 
sulfation, transfer-RNA mediated addition of amino acids to proteins such as 
arginylation, and ubiquitination. See, for instance, PROTEINS - STRUCTURE AND 
MOLECULAR PROPERTIES, 2nd Ed., T. E. Creighton, W. H. Freeman and 
Company, New York, 1993 and Wold, F., Posttranslational Protein Modifications: 
Perspectives and Prospects, pgs. 1-12 in POSTTRANSLATIONAL COVALENT 
MODIFICATION OF PROTEINS, B. C. Johnson, Ed., Academic Press, New York, 
1983; Seifter et al., "Analysis for protein modifications and nonprotein cofactors", Meth 
Enzymol (1990) 182:626-646 and Rattan et al., "Protein Synthesis: Posttranslational 
Modifications and Aging", Ann NY Acad Sci (1992) 663:48-62. 

"Polynucleotide" generally refers to any polyribonucleotide or 
polydeoxyribonucleotide, which may be unmodified RNA or DNA or modified RNA or 
DNA. "Polynucleotides" include, without limitation, single- and double-stranded DNA, 
DNA that is a mixture of single- and double-stranded regions, single- and double-stranded 
RNA, and RNA that is a mixture of single- and double-stranded regions, hybrid 
molecules comprising DNA and RNA that may be single-stranded or, more typically, 
double-stranded or a mixture of single- and double-stranded regions. In addition, 
"polynucleotide" refers to triple-stranded regions comprising RNA or DNA or both RNA 
and DNA. The term polynucleotide also includes DNAs or RNAs containing one or 
more modified bases and DNAs or RNAs with backbones modified for stability or for 
other reasons. "Modified" bases include, for example, tritylated bases and unusual bases 
such as inosine. A variety of modifications has been made to DNA and RNA; thus, 
"polynucleotide" embraces chemically, enzymatically or metabolically modified forms 
of polynucleotides as typically found in nature, as well as the chemical forms of DNA 
and RNA characteristic of viruses and cells. "Polynucleotide" also embraces relatively 
short polynucleotides, often referred to as oligonucleotides. 

"Variant" as the term is used herein, is a polynucleotide or polypeptide that 
differs from a reference polynucleotide or polypeptide respectively, but retains essential 
properties. A typical variant of a polynucleotide differs in nucleotide sequence from 
another, reference polynucleotide. Changes in the nucleotide sequence of the variant 
may or may not alter the amino acid sequence of a polypeptide encoded by the reference 
polynucleotide. Nucleotide changes may result in amino acid substitutions, additions, 
deletions, fusions and truncations in the polypeptide encoded by the reference sequence, 
as discussed below. A typical variant of a polypeptide differs in amino acid sequence 
from another, reference polypeptide. Generally, differences are limited so that the 
sequences of the reference polypeptide and the variant are closely similar overall and, in 
many regions, identical. A variant and reference polypeptide may differ in amino acid 
sequence by one or more substitutions (preferably conservative), additions, deletions in 
any combination. A substituted or inserted amino acid residue may or may not be one 
encoded by the genetic code. A variant of a polynucleotide or polypeptide may be 
naturally occurring, such as an allelic variant, or it may be a variant that is not known to 
occur naturally. Non-naturally occurring variants of polynucleotides and polypeptides 
may be made by mutagenesis techniques or by direct synthesis. Variants should retain 
one or more of the biological activities of the reference polypeptide. For instance, they 
should have similar (preferably the same) antigenic or immunogenic activities as the 
reference polypeptide. Antigenicity can be tested using standard immunoblot 
experiments, preferably using polyclonal sera against the reference polypeptide. The 
immunogenicity can best be tested by measuring antibody responses (using polyclonal 
sera generated against the variant polypeptide) against purified reference polypeptide in a 
standard ELISA test. Preferably, a variant would retain all of the above biological 
activities. 

25 "Identity" is a measure of the identity of nucleotide sequences or amino acid 

sequences. In general, the sequences are aligned so that the highest order match is 
obtained. "Identity" per se has an art-recognized meaning and can be calculated using 
published techniques. See, e.g.: (COMPUTATIONAL MOLECULAR BIOLOGY, 
Lesk, A.M., ed., Oxford University Press, New York, 1988; BIOCOMPUTING: 
INFORMATICS AND GENOME PROJECTS, Smith, D.W., ed., Academic Press, New 
York, 1993; COMPUTER ANALYSIS OF SEQUENCE DATA, PART I, Griffin, A.M., 
and Griffin, H.G., eds., Humana Press, New Jersey, 1994; SEQUENCE ANALYSIS IN 
MOLECULAR BIOLOGY, von Heijne, G., Academic Press, 1987; and SEQUENCE 
ANALYSIS PRIMER, Gribskov, M. and Devereux, J., eds., M Stockton Press, New 
York, 1991). While there exist a number of methods to measure identity between two 
polynucleotide or polypeptide sequences, the term "identity" is well known to skilled 
artisans (Carillo, H., and Lipman, D., SIAM J Applied Math (1988) 48:1073). Methods 
commonly employed to determine identity or similarity between two sequences include, 
but are not limited to, those disclosed in Guide to Huge Computers, Martin J. Bishop, 
ed., Academic Press, San Diego, 1994, and Carillo, H., and Lipman, D., SIAM J Applied 
Math (1988) 48:1073. Methods to determine identity and similarity are codified in 
computer programs. Preferred computer program methods to determine identity and 
similarity between two sequences include, but are not limited to, GCG program package 
(Devereux, J., et al., Nucleic Acids Research (1984) 12(1):387), BLASTP, BLASTN, 
FASTA (Altschul, S.F. et al., J Molec Biol (1990) 215:403). Most preferably, the 
program used to determine identity levels was the GCG 9 package, as was used in the 
Examples below. 

As an illustration, by a polynucleotide having a nucleotide sequence having at 
least, for example, 95% "identity" to a reference nucleotide sequence is intended that the 
nucleotide sequence of the polynucleotide is identical to the reference sequence except 
that the polynucleotide sequence may include on average up to five point mutations per 
each 100 nucleotides of the reference nucleotide sequence. In other words, to obtain a 
polynucleotide having a nucleotide sequence at least 95% identical to a reference 
nucleotide sequence, up to 5% of the nucleotides in the reference sequence may be 
deleted or substituted with another nucleotide, or a number of nucleotides up to 5% of 
the total nucleotides in the reference sequence may be inserted into the reference 
sequence. These mutations of the reference sequence may occur at the 5' or 3' terminal 
positions of the reference nucleotide sequence or anywhere between those terminal 
positions, interspersed either individually among nucleotides in the reference sequence or 
in one or more contiguous groups within the reference sequence. 
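
The percentage described above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (gaps written as "-") and counts every substitution, deletion or insertion against the length of the reference sequence; the function name and the example sequences are hypothetical, and the alignment step itself (performed in practice by a package such as those cited above) is not shown.

```python
def percent_identity(aligned_ref: str, aligned_query: str) -> float:
    """Percent identity of a query to a reference, given a pre-computed alignment.

    Assumption for illustration: both strings have equal length and use '-'
    for gap positions. Each mismatched column (substitution, deletion or
    insertion) counts as one difference relative to the reference length.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    ref_len = sum(1 for base in aligned_ref if base != "-")
    if ref_len == 0:
        raise ValueError("reference sequence is empty")
    differences = sum(1 for r, q in zip(aligned_ref, aligned_query) if r != q)
    # Clamp at zero in case insertions outnumber reference positions.
    return max(0.0, 100.0 * (ref_len - differences) / ref_len)


# Hypothetical 100-nucleotide reference with five point substitutions.
reference = "ACGT" * 25
query = list(reference)
for position in (0, 20, 40, 60, 80):
    query[position] = "C"  # each of these positions holds 'A' in the reference
query = "".join(query)

print(percent_identity(reference, query))
```

Under these assumptions, five differences per 100 reference nucleotides correspond to the 95% threshold used in the illustration.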

Polypeptides of the invention 

In one aspect, the present invention relates to Bordetella pathogenicity proteins (or 
polypeptides). The Bordetella pathogenicity polypeptides include the polypeptides 
encoded by the genes defined in tables 2 and 3; as well as polypeptides comprising the 
amino acid sequence encoded by the genes defined in tables 2 and 3; and polypeptides 
comprising an amino acid sequence which has at least 75% identity to that encoded by 
the genes defined in tables 2 and 3 over their entire length, and preferably at least 80% 
identity, and more preferably at least 90% identity. Those with 95-99% identity are 
highly preferred. 

The Bordetella pathogenicity polypeptides (or fragments thereof) may be in the 
form of the "mature" protein or may be a part of a larger protein such as a fusion protein. 
It may be advantageous to include an additional amino acid sequence which contains 
secretory or leader sequences, pro-sequences, sequences which aid in purification such as 
multiple histidine residues or Maltose Binding Protein (MBP), or an additional sequence 
for stability during recombinant production. Furthermore, addition of exogenous 
polypeptide or lipid tail or polynucleotide sequences to increase the immunogenic 
potential of the final molecule is also considered. 

Fragments of the Bordetella pathogenicity polypeptides are also included in the 
invention. A fragment is a polypeptide having an amino acid sequence that is the same as 
part, but not all, of the amino acid sequence of the aforementioned Bordetella pathogenicity 
polypeptides. As with Bordetella pathogenicity polypeptides, fragments may be "free- 
standing," or comprised within a larger polypeptide of which they form a part or region, 
most preferably as a single continuous region. Representative examples of polypeptide 
fragments of the invention include, for example, fragments from about amino acid number 
1-20, 21-40, 41-60, 61-80, 81-100, and 101 to the end of Bordetella pathogenicity 
polypeptide. In this context "about" includes the particularly recited ranges larger or 
smaller by several, 5, 4, 3, 2 or 1 amino acid at either extreme or at both extremes. The 
fragments should comprise at least 7 consecutive amino acids from the sequences (e.g. 8, 
10, 12, 14, 18, 20 or more depending on the particular sequence). Preferably the 
fragments comprise an epitope from the sequence. 
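
As a rough illustration of the minimum fragment length stated above, the following sketch enumerates every run of 7 consecutive residues in a polypeptide. The function name and the ten-residue example sequence are hypothetical and are not taken from tables 2 and 3.

```python
def consecutive_fragments(sequence: str, length: int = 7) -> list:
    """Return every window of `length` consecutive residues in `sequence`.

    One-letter amino acid codes are assumed. A sequence shorter than
    `length` yields an empty list.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


# Hypothetical ten-residue peptide, for illustration only.
for fragment in consecutive_fragments("MKTAYIAKQR", 7):
    print(fragment)
```

Raising the `length` argument to 8, 10, 12 and so on gives the longer fragments mentioned in the text.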

Preferred fragments include, for example, truncation polypeptides having the amino 
acid sequence of Bordetella pathogenicity polypeptides, except for deletion of a continuous 
series of residues that includes the amino terminus, or a continuous series of residues that 
includes the carboxyl terminus and/or transmembrane region or deletion of two continuous 
series of residues, one including the amino terminus and one including the carboxyl 
terminus. Also preferred are fragments characterized by structural or functional attributes 
such as fragments that comprise alpha-helix and alpha-helix forming regions, beta-sheet 
and beta-sheet-forming regions, turn and turn-forming regions, coil and coil-forming 
regions, hydrophilic regions, hydrophobic regions, alpha amphipathic regions, beta 
amphipathic regions, flexible regions, surface-forming regions, substrate binding region, 
and high antigenic index regions. Other preferred fragments are biologically active 
fragments. Biologically active fragments are those that mediate Bordetella pathogenicity 
protein activity, including those with a similar activity or an improved activity, or with a 
decreased undesirable activity. Also included are those that are antigenic or immunogenic 
in an animal, especially in a human. 

Preferably, all of these polypeptide fragments retain the biological activity (for 
instance antigenic or immunogenic) of the Bordetella pathogenicity protein, including 
antigenic activity. Variants of the defined sequence and fragments also form part of the 
present invention. Preferred variants are those that vary from the referents by conservative 
amino acid substitutions i.e., those that substitute a residue with another of like 
characteristics. Typical such substitutions are among Ala, Val, Leu and Ile; among Ser and 
Thr; among the acidic residues Asp and Glu; among Asn and Gln; and among the basic 
residues Lys and Arg; or aromatic residues Phe and Tyr. Particularly preferred are variants 
in which several, 5-10, 1-5, or 1-2 amino acids are substituted, deleted, or added in any 
combination. Most preferred variants are naturally occurring allelic variants of Bordetella 
pathogenicity polypeptide present in strains of Bordetella pertussis. 
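
The substitution groups listed above lend themselves to a simple lookup. The sketch below is illustrative only: it encodes just the six groups named in the text (using standard three-letter residue codes) and treats a substitution as conservative when both residues fall in the same group. The names used are hypothetical.

```python
# Groups of residues "of like characteristics", as listed in the text.
CONSERVATIVE_GROUPS = [
    {"Ala", "Val", "Leu", "Ile"},  # aliphatic
    {"Ser", "Thr"},                # hydroxyl
    {"Asp", "Glu"},                # acidic
    {"Asn", "Gln"},                # amide
    {"Lys", "Arg"},                # basic
    {"Phe", "Tyr"},                # aromatic
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if replacing `original` with `replacement` stays within one group."""
    if original == replacement:
        return False  # no substitution has taken place
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS
    )


print(is_conservative("Leu", "Ile"))  # both aliphatic
print(is_conservative("Asp", "Lys"))  # acidic replaced by basic
```

Residues outside the listed groups are simply reported as non-conservative here; a fuller treatment would use a substitution matrix, which the text does not specify.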

The proteins may be chemically conjugated, or expressed as recombinant fusion 
proteins allowing increased levels to be produced in an expression system as compared to 
non-fused protein. The fusion partner may assist in providing T helper epitopes 
(immunological fusion partner), preferably T helper epitopes recognised by humans, or 
assist in expressing the protein (expression enhancer) at higher yields than the native 
recombinant protein. Preferably the fusion partner will be both an immunological fusion 
partner and expression enhancing partner. 

The Bordetella pathogenicity polypeptides of the invention can be prepared in any 
suitable manner. Such polypeptides include isolated naturally occurring polypeptides, 
recombinantly produced polypeptides, synthetically produced polypeptides, or polypeptides 
produced by a combination of these methods. Means for preparing such polypeptides are 
well understood in the art. 

It is most preferred that a polypeptide of the invention is derived from Bordetella 
pertussis, however, it may preferably be obtained from other organisms of the same 
taxonomic genus. A polypeptide of the invention may also be obtained, for example, from 
organisms of the same taxonomic family or order, such as Bordetella parapertussis or 
Bordetella bronchiseptica. 

A further aspect of the invention is substantially purified Bordetella pathogenicity 
polypeptides of the invention. "Substantially purified" when used in reference to a protein 
or peptide means that the molecule has been largely, but not necessarily wholly, 
separated and purified from other cellular and non-cellular components. Typically a 
protein is substantially pure when it is at least about 60% by weight free from other 
naturally occurring organic molecules. Preferably the purity is at least about 75%, more 
preferably at least about 90%, and most preferably at least about 99% by weight pure. 



Polynucleotides of the invention 

Another aspect of the invention relates to Bordetella pathogenicity polynucleotides. 

Bordetella pathogenicity polynucleotides include isolated polynucleotides which encode 
the Bordetella pathogenicity polypeptides and fragments respectively, and polynucleotides 
closely related thereto or variants thereof. More specifically, Bordetella pathogenicity 
polynucleotides of the invention include a polynucleotide comprising the nucleotide 
sequence of genes defined in table 2 or 3, encoding a Bordetella pathogenicity polypeptide. 
Bordetella pathogenicity polynucleotides further include a polynucleotide comprising a 
nucleotide sequence that has at least 75% identity over its entire length to a nucleotide 
sequence encoding the Bordetella pathogenicity polypeptide encoded by the genes defined 
in tables 2 and 3, and a polynucleotide comprising a nucleotide sequence that is at least 
75% identical to that of the genes defined in tables 2 and 3. In this regard, 
polynucleotides at least 80% identical are particularly preferred, and those with at least 
90% are especially preferred. Furthermore, those with at least 95% are highly preferred 
and those with at least 98-99% are most highly preferred, with at least 99% being the most 
preferred. Also included under Bordetella pathogenicity polynucleotides is a nucleotide 
sequence which has sufficient identity to a nucleotide sequence of a gene defined in 
tables 2 and 3 to hybridize under conditions useable for amplification or for use as a 
probe or marker. The invention also provides polynucleotides which are complementary 
to such Bordetella pathogenicity polynucleotides. 
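
For illustration, the complementary strand of a DNA sequence can be derived as in the sketch below, which returns the reverse complement read 5' to 3'. The function name and the example sequence are hypothetical; only the four unambiguous bases are handled.

```python
# Watson-Crick pairing for the four DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Return the complementary strand of `sequence`, written 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))
```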



Using the information provided herein, such as specific Bordetella pathogenicity 
gene and polypeptide sequences, a polynucleotide of the invention encoding a Bordetella 
pathogenicity polypeptide may be obtained using standard cloning and screening methods, 
such as those for cloning and sequencing chromosomal DNA fragments from bacteria 
using Bordetella pertussis cells as starting material, followed by obtaining a full length 
clone. For example, to obtain a polynucleotide sequence of the invention, typically a 
library of clones of chromosomal DNA of Bordetella pertussis in E. coli or some other 
suitable host is probed with a radiolabeled oligonucleotide, preferably a 17-mer or 
longer, derived from a partial sequence. Clones carrying DNA identical to that of the 
probe can then be distinguished using stringent hybridization conditions. By sequencing 
the individual clones thus identified by hybridization with sequencing primers designed 
from the original polypeptide or polynucleotide sequence it is then possible to extend the 
polynucleotide sequence in both directions to determine a full length gene sequence. 
Conveniently, such sequencing is performed, for example, using denatured double 
stranded DNA prepared from a plasmid clone. Suitable techniques are described by 
Maniatis, T., Fritsch, E.F. and Sambrook et al., MOLECULAR CLONING, A 
LABORATORY MANUAL, 2nd Ed.; Cold Spring Harbor Laboratory Press, Cold Spring 
Harbor, New York (1989) (see in particular Screening By Hybridization 1.90 and 
Sequencing Denatured Double-Stranded DNA Templates 13.70). Direct genomic DNA 
sequencing may also be performed to obtain a full length gene sequence. 


A polynucleotide encoding a polypeptide of the present invention, including 
homologs and orthologs from species other than Bordetella pertussis, may be obtained by a 
process which comprises the steps of screening an appropriate library under stringent 
hybridization conditions (for example, using a temperature in the range of 45 - 65°C and an 
SDS concentration from 0.1 - 1%) with a labeled or detectable probe consisting of or 
comprising a sequence defined in table 2 or 3 or a fragment thereof; and isolating a full- 
length gene and/or genomic clones containing said polynucleotide sequence. 

The invention also provides a polynucleotide consisting of or comprising a 
polynucleotide sequence obtained by screening an appropriate library containing the 
complete gene for a polynucleotide sequence defined in tables 2 and 3 under stringent 
hybridization conditions with a probe having the sequence of said polynucleotide 
sequence defined in table 2 or 3 or a fragment thereof; and isolating said polynucleotide 
sequence. Fragments useful for obtaining such a polynucleotide include, for example, 
probes and primers described elsewhere herein. 

The nucleotide sequence encoding Bordetella pathogenicity polypeptide encoded 
by the genes defined in tables 2 and 3 may be identical to the polypeptide encoding 
sequence contained in the genes defined in tables 2 or 3, or it may be a sequence, which as 
a result of the redundancy (degeneracy) of the genetic code, also encodes the polypeptide 
encoded by the genes defined in tables 2 and 3 respectively. 
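
The redundancy (degeneracy) referred to above can be shown with a small sketch: two different nucleotide sequences that translate to the same peptide. Only a handful of codons from the standard genetic code are included, and the sequences are hypothetical examples rather than sequences of the invention.

```python
# Partial standard codon table, limited to the codons used in this example.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon (one-letter residues)."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    # A KeyError signals a codon outside this deliberately partial table.
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


first = "ATGGCTAAA"   # hypothetical coding sequence
second = "ATGGCCAAG"  # differs at two third-codon positions

print(translate(first), translate(second))
```

Both sequences give the same three-residue peptide although they differ at two nucleotide positions.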

When the polynucleotides of the invention are used for the recombinant 
production of Bordetella pathogenicity polypeptide, the polynucleotide may include the 
coding sequence for the mature polypeptide or a fragment thereof, by itself; the coding 
sequence for the mature polypeptide or fragment in reading frame with other coding 
sequences, such as those encoding a leader or secretory sequence, a pre-, or pro- or prepro- 
protein sequence, or other fusion peptide portions. For example, a marker sequence which 
facilitates purification of the fused polypeptide can be encoded. In certain preferred 
embodiments of this aspect of the invention, the marker sequence is a hexa-histidine 
peptide, as provided in the pQE vector (Qiagen, Inc.) and described in Gentz et al., Proc 
Natl Acad Sci USA (1989) 86:821-824, or is an HA tag, or is glutathione-S-transferase, or is 
MBP. The polynucleotide may also contain non-coding 5' and 3' sequences, such as 
transcribed, non-translated sequences, splicing and polyadenylation signals, ribosome 
binding sites and sequences that stabilize mRNA. 

Nucleic acid comprising fragments of the sequences of the invention are also 
provided. These should comprise at least 10 consecutive nucleotides from the sequences 
(e.g. 12, 14, 15, 18, 20, 25, 30, 35, 40 or more depending on the particular sequence). 
Such fragments can preferably hybridise to the above-mentioned sequences under 
stringent conditions. 

Further preferred embodiments are polynucleotides encoding Bordetella 
pathogenicity protein variants comprising the amino acid sequence of the Bordetella 
pathogenicity polypeptide encoded by the genes defined by tables 2 and 3 respectively in 
which several, 10-25, 5-10, 1-5, 1-3, 1-2 or 1 amino acid residues are substituted, deleted or 
added, in any combination. Most preferred variant polynucleotides are those naturally 
occurring Bordetella pertussis sequences that encode allelic variants of the Bordetella 
pathogenicity proteins in Bordetella strains, preferably B. pertussis. 

The present invention further relates to polynucleotides that hybridize to the herein 
above-described sequences. In this regard, the present invention especially relates to 
polynucleotides which hybridize under stringent conditions to the herein above-described 
polynucleotides. As herein used, the term "stringent conditions" means hybridization will 
occur only if there is at least 80%, and preferably at least 90%, and more preferably at least 
95%, yet even more preferably 97-99% identity between the sequences. 

Polynucleotides of the invention, which are identical or sufficiently identical to a 
nucleotide sequence of any gene defined in tables 2 and 3 or a fragment thereof, may be 
used as hybridization probes for cDNA and genomic DNA, to isolate full-length cDNAs 
and genomic clones encoding Bordetella pathogenicity polypeptides respectively and to 
isolate cDNA and genomic clones of other genes (including genes encoding homologs and 
orthologs from species other than Bordetella pertussis) that have a high sequence similarity 
to the Bordetella pathogenicity genes. Such hybridization techniques are known to those of 
skill in the art. Typically these nucleotide sequences are 80% identical, preferably 90% 
identical, more preferably 95% identical to that of the referent. The probes generally will 
comprise at least 15 nucleotides. Preferably, such probes will have at least 30 nucleotides 
and may have at least 50 nucleotides. Particularly preferred probes will range between 30 
and 50 nucleotides. In one embodiment, a process to obtain a polynucleotide encoding Bordetella 
pathogenicity polypeptide, including homologs and orthologs from species other than 
Bordetella pertussis, comprises the steps of screening an appropriate library under stringent 
hybridization conditions with a labeled probe having a nucleotide sequence contained in 
one of the gene sequences defined by tables 2 and 3, or a fragment thereof; and isolating 
full-length cDNA and genomic clones containing said polynucleotide sequence. Thus in 
another aspect, Bordetella pathogenicity polynucleotides of the present invention further 
include a nucleotide sequence comprising a nucleotide sequence that hybridizes under 
stringent conditions to a nucleotide sequence having a nucleotide sequence contained in one 
of the genes defined by tables 2 and 3, or a fragment thereof. Also included with Bordetella 
pathogenicity polypeptides are polypeptides comprising amino acid sequences encoded by 
nucleotide sequences obtained by the above hybridization conditions. Such hybridization 
techniques are well known to those of skill in the art. Stringent hybridization conditions 
are as defined above or, alternatively, conditions under overnight incubation at 42°C in a 
solution comprising: 50% formamide, 5xSSC (150mM NaCl, 15mM trisodium citrate), 50 
mM sodium phosphate (pH 7.6), 5x Denhardt's solution, 10% dextran sulfate, and 20 
microgram/ml denatured, sheared salmon sperm DNA, followed by washing the filters in 
0.1x SSC at about 65°C. 
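
As a worked illustration of the SSC strengths mentioned above, the sketch below scales salt concentrations by the stated fold. It assumes that the parenthetical figures (150 mM NaCl, 15 mM trisodium citrate) describe the 1x solution, which is the usual composition of SSC; the names used are hypothetical.

```python
# Assumed composition of 1x SSC, in millimolar.
SSC_1X_MM = {"NaCl": 150.0, "trisodium citrate": 15.0}


def ssc_concentrations(fold: float) -> dict:
    """Return the salt concentrations (mM) of SSC at the given fold strength."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {salt: mm * fold for salt, mm in SSC_1X_MM.items()}


hybridization = ssc_concentrations(5)    # 5x SSC used during hybridization
wash = ssc_concentrations(0.1)           # 0.1x SSC used for the final wash

for label, mix in (("5x", hybridization), ("0.1x", wash)):
    print(label, {salt: round(mm, 3) for salt, mm in mix.items()})
```

On this assumption the hybridization solution contains about 750 mM NaCl and 75 mM trisodium citrate, while the wash contains about 15 mM and 1.5 mM respectively; the low salt of the wash is what makes it the more stringent step.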

A coding region of a Bordetella pathogenicity gene may be isolated by screening 
using a DNA sequence defined in table 2 or 3 to synthesize an oligonucleotide probe. A 
labeled oligonucleotide having a sequence complementary to that of a gene of the invention 
is then used to screen a library of cDNA, genomic DNA or mRNA to determine which 
members of the library the probe hybridizes to. 

There are several methods available and well known to those skilled in the art to 
obtain full-length DNAs, or extend short DNAs, for example those based on the method of 
Rapid Amplification of cDNA ends (RACE) (see, for example, Frohman, et al., PNAS USA 
85: 8998-9002, 1988). Recent modifications of the technique, exemplified by the 
Marathon™ technology (Clontech Laboratories Inc.) for example, have significantly 
simplified the search for longer cDNAs. In the Marathon™ technology, cDNAs have been 
prepared from mRNA extracted from a chosen tissue and an 'adaptor' sequence ligated onto 
each end. Nucleic acid amplification (PCR) is then carried out to amplify the "missing" 5' 
end of the DNA using a combination of gene specific and adaptor specific oligonucleotide 
primers. The PCR reaction is then repeated using "nested" primers, that is, primers 
designed to anneal within the amplified product (typically an adaptor specific primer that 
anneals further 3' in the adaptor sequence and a gene specific primer that anneals further 5' 
in the selected gene sequence). The products of this reaction can then be analyzed by DNA 
sequencing and a full-length DNA constructed either by joining the product directly to the 
existing DNA to give a complete sequence, or carrying out a separate full-length PCR 
using the new sequence information for the design of the 5' primer. 


The polynucleotides of the invention that are oligonucleotides derived from a 
sequence defined in table 2 or 3 may be used in the processes herein as described, but 
preferably for PCR, to determine whether or not the polynucleotides identified herein in 
whole or in part are transcribed in bacteria in infected tissue. It is recognized that such 
sequences will also have utility in diagnosis of the stage of infection and type of infection 
the pathogen has attained. 



The polynucleotides and polypeptides of the present invention may be employed as 
research reagents and materials for discovery of treatments and diagnostics to animal and 
human disease. 



Diagnostic Assays 

This invention also relates to the use of Bordetella pathogenicity polypeptides, or 
Bordetella pathogenicity polynucleotides, for use as diagnostic reagents. Detection of 
Bordetella pathogenicity polypeptides will provide a diagnostic tool that can add to or 
define a diagnosis of B. pertussis disease, among others. 

Materials for diagnosis may be obtained from a subject's cells, such as from blood, 
urine, saliva or tissue biopsy. 


Thus in another aspect, the present invention relates to a diagnostic kit for a 
disease or susceptibility to a disease, particularly B. pertussis disease, which comprises: 
(a) a Bordetella pathogenicity polynucleotide, preferably the nucleotide sequence of one 
of the gene sequences defined by tables 2 and 3, or a fragment thereof; 
(b) a nucleotide sequence complementary to that of (a); 






(c) a Bordetella pathogenicity polypeptide, preferably the polypeptide encoded by one of 
the gene sequences defined in tables 2 and 3, or a fragment thereof; 

(d) an antibody to a Bordetella pathogenicity polypeptide, preferably to the polypeptide 
encoded by one of the gene sequences defined in tables 2 and 3; or 

(e) a phage displaying an antibody to a Bordetella pathogenicity polypeptide, preferably 
to the polypeptide encoded by one of the gene sequences defined in tables 2 and 3. 

It will be appreciated that in any such kit, (a), (b), (c), (d) or (e) may comprise a 
substantial component. 


Polypeptides and polynucleotides for prognosis, diagnosis or other analysis may be 
obtained from a putatively infected and/or infected individual's bodily materials. 
Polynucleotides from any of these sources, particularly DNA or RNA, may be used directly 
for detection or may be amplified enzymatically by using PCR or any other amplification 
technique prior to analysis. RNA, particularly mRNA, cDNA and genomic DNA may also 
be used in the same ways. Using amplification, characterization of the species and strain of 
infectious or resident organism present in an individual, may be made by an analysis of the 
genotype of a selected polynucleotide of the organism. Deletions and insertions can be 
detected by a change in size of the amplified product in comparison to a genotype of a 
reference sequence selected from a related organism, preferably a different species of the 
same genus or a different strain of the same species. Point mutations can be identified by 
hybridizing amplified DNA to labeled Bordetella pathogenicity polynucleotide sequences. 
Perfectly or significantly matched sequences can be distinguished from imperfectly or more 
significantly mismatched duplexes by DNase or RNase digestion, for DNA or RNA 
respectively, or by detecting differences in melting temperatures or renaturation kinetics. 
Polynucleotide sequence differences may also be detected by alterations in the 
electrophoretic mobility of polynucleotide fragments in gels as compared to a reference 
sequence. This may be carried out with or without denaturing agents. Polynucleotide 
differences may also be detected by direct DNA or RNA sequencing. See, for example, 
Myers et al., Science, 230: 1242 (1985). Sequence changes at specific locations also may be 
revealed by nuclease protection assays, such as RNase, V1 and S1 protection assay or a 
chemical cleavage method. See, for example, Cotton et al., Proc. Natl. Acad. Sci., USA, 85: 
4397-4401 (1985). 

This invention also relates to the use of polynucleotides of the present invention as 
diagnostic reagents. Detection of a mutated form of a polynucleotide of the invention, which 
is associated with a disease or pathogenicity will provide a diagnostic tool that can add to, or 
define, a diagnosis of a disease, a prognosis of a course of disease, a determination of a stage 
of disease, or a susceptibility to a disease, which results from under-expression, over- 
expression or altered expression of the polynucleotide. Organisms, particularly infectious 
organisms, carrying mutations in such polynucleotide may be detected at the polynucleotide 
level by a variety of techniques, such as those described elsewhere herein. 

The invention further provides a process for diagnosing disease, preferably bacterial 
(particularly Bordetella) infections, more preferably infections caused by Bordetella 
pertussis, comprising determining from a sample derived from an individual, such as a 
bodily material, an increased level of expression of polynucleotide having a sequence 
defined in table 2 or 3. Increased or decreased expression of a polynucleotide can be 
measured using any one of the methods well known in the art for the quantitation of 
polynucleotides, such as, for example, amplification, PCR, RT-PCR, RNase protection, 
Northern blotting, spectrometry and other hybridization methods. 

Vectors, Host Cells, Expression Systems 

The invention also relates to vectors that comprise a polynucleotide or 
polynucleotides of the invention, host cells that are genetically engineered with vectors of the 
invention and the production of polypeptides of the invention by recombinant techniques. 
Cell-free translation systems can also be employed to produce such proteins using RNAs 
derived from the DNA constructs of the invention. 

Recombinant polypeptides of the present invention may be prepared by processes 
well known to those skilled in the art from genetically engineered host cells comprising 
expression systems. Accordingly, in a further aspect, the present invention relates to 
expression systems that comprise a polynucleotide or polynucleotides of the present 
invention, to host cells which are genetically engineered with such expression systems, and to 
the production of polypeptides of the invention by recombinant techniques. 

For recombinant production of the polypeptides of the invention, host cells can be 
genetically engineered to incorporate expression systems or portions thereof or 
polynucleotides of the invention. Introduction of a polynucleotide into the host cell can be 
effected by methods described in many standard laboratory manuals, such as Davis, et al., 
BASIC METHODS IN MOLECULAR BIOLOGY, (1986) and Sambrook, et al., 
MOLECULAR CLONING: A LABORATORY MANUAL, 2nd Ed., Cold Spring Harbor 
Laboratory Press, Cold Spring Harbor, N.Y. (1989), such as, calcium phosphate transfection, 
DEAE-dextran mediated transfection, transvection, microinjection, cationic lipid-mediated 
transfection, electroporation, transduction, scrape loading, ballistic introduction and infection. 

Representative examples of appropriate hosts include bacterial cells, such as cells of 
streptococci, staphylococci, enterococci, E. coli, streptomyces, cyanobacteria, Bacillus 
subtilis, Moraxella catarrhalis, Haemophilus influenzae and Neisseria meningitidis; fungal 
cells, such as cells of a yeast, Kluyveromyces, Saccharomyces, a basidiomycete, Candida 
albicans and Aspergillus; insect cells such as cells of Drosophila S2 and Spodoptera Sf9; 
animal cells such as CHO, COS, HeLa, C127, 3T3, BHK, 293, CV-1 and Bowes melanoma 
cells; and plant cells, such as cells of a gymnosperm or angiosperm. 


A great variety of expression systems can be used to produce the polypeptides of the 
invention. Such vectors include, among others, chromosomal-, episomal- and virus-derived 
vectors, for example, vectors derived from bacterial plasmids, from bacteriophage, from 
transposons, from yeast episomes, from insertion elements, from yeast chromosomal 
elements, from viruses such as baculoviruses, papova viruses, such as SV40, vaccinia 
viruses, adenoviruses, fowl pox viruses, pseudorabies viruses, picornaviruses, retroviruses, 
and alphaviruses and vectors derived from combinations thereof, such as those derived from 
plasmid and bacteriophage genetic elements, such as cosmids and phagemids. The 
expression system constructs may contain control regions that regulate as well as engender 
expression. Generally, any system or vector suitable to maintain, propagate or express 
polynucleotides and/or to express a polypeptide in a host may be used for expression in this 
regard. The appropriate DNA sequence may be inserted into the expression system by any of 
a variety of well-known and routine techniques, such as, for example, those set forth in 
Sambrook et al., MOLECULAR CLONING, A LABORATORY MANUAL, (supra). 

In recombinant expression systems in eukaryotes, for secretion of a translated protein 
into the lumen of the endoplasmic reticulum, into the periplasmic space or into the 
extracellular environment, appropriate secretion signals may be incorporated into the 
expressed polypeptide. These signals may be endogenous to the polypeptide or they may be 
heterologous signals. 

Polypeptides of the present invention can be recovered and purified from 
recombinant cell cultures by well-known methods including ammonium sulfate or ethanol 
precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose 
chromatography, hydrophobic interaction chromatography, affinity chromatography, 
hydroxylapatite chromatography and lectin chromatography. Most preferably, ion metal 
affinity chromatography (IMAC) is employed for purification. Well known techniques for 
refolding proteins may be employed to regenerate active conformation when the 
polypeptide is denatured during intracellular synthesis, isolation and/or purification. 


The expression system may also be a recombinant live microorganism, such as a 
virus or bacterium. The gene of interest can be inserted into the genome of a live 
recombinant virus or bacterium. Inoculation and in vivo infection with this live vector 
will lead to in vivo expression of the antigen and induction of immune responses. 
Viruses and bacteria used for this purpose are for instance: poxviruses (e.g. vaccinia, 
fowlpox, canarypox), alphaviruses (Sindbis virus, Semliki Forest Virus, Venezuelan 
Equine Encephalitis Virus), adenoviruses, adeno-associated virus, picornaviruses 
(poliovirus, rhinovirus), herpesviruses (varicella zoster virus, etc), Listeria, Salmonella, 
Shigella, Neisseria, BCG. These viruses and bacteria can be virulent, or attenuated in 
various ways in order to obtain live vaccines. Such live vaccines also form part of the 
invention. 

Antibodies 

According to a further aspect, the invention provides antibodies which bind 
specifically to the polypeptides of the invention. These may be polyclonal or monoclonal 
and may be produced by any suitable means well known to a person skilled in the art. 

Typically, a mouse or rat is immunised by injection with a protein (preferably 
adjuvanted with Freund's complete adjuvant); doses of 50-200 µg/injection are typically 
sufficient. Polyclonal antibodies can be isolated by bleeding the animal to extract serum. 
Alternatively, monoclonal antibodies can be generated by removing the spleen (or large 
lymph nodes) and dissociating it into single cells (Kohler and Milstein, (1975) Nature, 
256:495-497). These are then induced to fuse with myeloma cells to form hybridomas, 
and are cultured in a selective medium (e.g. hypoxanthine, aminopterin, thymidine 
medium, "HAT"). The resulting hybridomas are plated by limiting dilution, and are 
assayed for the production of antibodies which bind specifically to the immunizing 
antigen (and which do not bind to unrelated antigens). The selected monoclonal-secreting 
hybridomas are then cultured either in vitro (e.g. in tissue culture bottles or hollow fiber 
reactors), or in vivo (as ascites in mice). 

Techniques for the production of single chain antibodies (U.S. Patent No. 4,946,778) 
can be adapted to produce single chain antibodies to polypeptides or polynucleotides of this 
invention. Also, transgenic mice, or other organisms or animals, such as other mammals, 
may be used to express humanized antibodies immunospecific to the polypeptides or 
polynucleotides of the invention. 

Alternatively, phage display technology may be utilized to select antibody genes 
with binding activities towards a polypeptide of the invention either from repertoires of 
PCR amplified v-genes of lymphocytes from humans screened for possessing anti-Bordetella 
pathogenicity polypeptide or from naive libraries (McCafferty et al., (1990), 
Nature 348, 552-554; Marks et al., (1992) Biotechnology 10, 779-783). The affinity of 
these antibodies can also be improved by, for example, chain shuffling (Clackson et al., 
(1991) Nature 352: 628). 



The above-described antibodies may be employed to isolate or to identify clones 
expressing the polypeptides or polynucleotides of the invention, or to purify the 
polypeptides or polynucleotides by, for example, affinity chromatography. 



Antibodies against a Bordetella pathogenicity polypeptide or polynucleotide may be 
employed to treat infections, particularly bacterial infections. 



Polypeptide variants, including antigenically, epitopically or immunologically 
equivalent variants, form a particular aspect of this invention. 

Preferably, the antibody or variant thereof is modified to make it less immunogenic 
in the individual. For example, if the individual is human the antibody may most 
preferably be "humanized," where the complementarity determining region or regions of 
the hybridoma-derived antibody have been transplanted into a human monoclonal antibody, 
for example as described in Jones et al. (1986), Nature 321, 522-525 or Tempest et al. 
(1991) Biotechnology 9, 266-273. 

Vaccines 

Another aspect of the invention relates to a method for inducing an 
immunological response in a mammal which comprises inoculating the mammal with 
Bordetella pathogenicity polypeptide or epitope-bearing fragments, analogs, 
outer-membrane vesicles or cells (attenuated or otherwise) adequate to produce antibody and/or 
T cell immune response to protect said animal from Bordetella (particularly B. pertussis) 
disease, among others. Such agents may be used alone, or conjugated to another molecule 
which improves their immunological potency. In particular the invention relates to the use 
of Bordetella pathogenicity polypeptides encoded by the genes defined in table 3 - the 
effector proteins. Yet another aspect of the invention relates to a method of inducing an 
immunological response in a mammal which comprises delivering Bordetella 
pathogenicity polypeptide via a vector directing expression of Bordetella pathogenicity 
polynucleotide in vivo in order to induce such an immunological response to produce 
antibody to protect said animal from diseases. 



A further aspect of the invention relates to an immunological composition or 
vaccine formulation which, when introduced into a mammalian host, induces an 
immunological response in that mammal to a Bordetella pathogenicity polypeptide 
(particularly one encoded by a gene defined in table 3) wherein the composition 
comprises a Bordetella pathogenicity gene, or Bordetella pathogenicity polypeptide or 
epitope-bearing fragments, analogs, outer-membrane vesicles or cells (attenuated or 
otherwise). The vaccine formulation may further comprise a suitable carrier. The 
Bordetella pathogenicity polypeptide vaccine composition is preferably administered 
orally or parenterally (including subcutaneous, intramuscular, intravenous, intradermal 
etc. injection). Formulations suitable for parenteral administration include aqueous and 
non-aqueous sterile injection solutions which may contain anti-oxidants, buffers, 
bacteriostats and solutes which render the formulation isotonic with the blood of the 
recipient; and aqueous and non-aqueous sterile suspensions which may include 
suspending agents or thickening agents. The formulations may be presented in unit-dose 
or multi-dose containers, for example, sealed ampoules and vials and may be stored in a 
freeze-dried condition requiring only the addition of the sterile liquid carrier immediately 
prior to use. The vaccine formulation may also include adjuvant systems for enhancing 
the immunogenicity of the formulation, such as oil-in-water systems and other systems 
known in the art. The dosage will depend on the specific activity of the vaccine and can 
be readily determined by routine experimentation. 

The vaccine formulations of the invention may also comprise other Bordetella 
antigens known to be suitable vaccinal agents, for instance: pertussis toxoid, pertactin, 
agglutinogens 1 and 2, FHA (filamentous hemagglutinin), and adenylate cyclase / 
haemolysin (AC/HLY), or immunogenic fragments thereof (Locht et al., NAR (1986) 
14:3251-3261; Relman et al., PNAS USA (1989) 86:2637-2641; Roberts et al., Mol. 
Microbiol. (1991) 5:1393-1404; Mooi et al., Microb. Pathog. (1992) 12:127-135; 
Hewlett and Gordon, In Pathogenesis and Immunity in Pertussis (1988), New York, 
Wiley & Sons, pp. 193-209). 



Yet another aspect of the invention relates to an immunological/vaccine 
formulation which comprises the polynucleotide of the invention. Such techniques are 
known in the art, see for example Wolff et al., Science, (1990) 247: 1465-8. 



Vaccine compositions can comprise polypeptides, antibodies, or polynucleotides 
of the invention. The pharmaceutical compositions will comprise a therapeutically 
effective amount of either polypeptides, antibodies, or polynucleotides of the claimed 
invention. 



The term "therapeutically effective amount" as used herein refers to an amount of 
a therapeutic agent to treat, ameliorate, or prevent a desired disease or condition (in this 
case Bordetella, particularly B. pertussis, disease), or to exhibit a detectable therapeutic 
or preventative effect. The effect can be detected by, for example, antigen levels. 
Therapeutic effects also include reduction in physical symptoms, such as decreased body 
temperature. Immunogenic compositions used as vaccines comprise an immunologically 
effective amount of the antigenic or immunogenic polypeptides. By "immunologically 
effective amount", it is meant that the administration of that amount to an individual, 
either in a single dose or as part of a series, is effective for treatment or prevention. 

EXAMPLES 

The examples below are carried out using standard techniques, which are well 
known and routine to those of skill in the art, except where otherwise described in detail. 
The examples illustrate, but do not limit the invention. 

Example 1: A type III secretion system is present in a pathogenicity island in Bordetella 
pertussis. 

The presence of an lcrD homologous gene in the Bordetella pertussis genome was 
investigated by polymerase chain reaction (PCR). The primers used (oligos 95080 and 
95081 shown in Table 1) were degenerate oligonucleotides corresponding to highly 
conserved regions of the amino acid sequences of the LcrD/FlbF family of proteins. 
These primers were also designed to favour the amplification of virulence genes instead 
of their paralogous flhA or flbF flagellar genes, present in flagellated bacterial strains. The 
presence of the 3' triplet CAT in oligonucleotide 95081 is the determining feature: when 
multiple sequence analysis was done using known homologous sequences (database 
searching was done with either the FASTA and TFASTA programs of the GCG9 
package, or with the BLASTN, BLASTP and BLASTX programs, and alignments were 
carried out with the PILEUP program from the GCG9 package), it could be seen that the 
CAT triplet, read on the complementary strand, codes for a methionine which is 
exclusively present in virulence sequences and absent in the flagellar ones. 
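
The primer design described above can be illustrated with a minimal Python sketch. It expands the IUB ambiguity codes of oligonucleotide 95081 (sequence copied from Table 1) and shows that the reverse complement of its 3' CAT triplet is the methionine codon ATG. The function names are illustrative only and are not part of the original work.

```python
from itertools import product

# IUB/IUPAC nucleotide ambiguity codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of each code (ambiguity codes complement to ambiguity codes).
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W", "K": "M", "M": "K",
    "B": "V", "V": "B", "D": "H", "H": "D", "N": "N",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


def reverse_complement(primer):
    """Reverse complement, keeping ambiguity codes."""
    return "".join(COMPLEMENT[b] for b in reversed(primer))


# Oligonucleotide 95081 from Table 1, written 5' to 3' without spaces.
OLIGO_95081 = "GCRTCDCCYTTDACRAAYTTCAT"

pool_size = degeneracy(OLIGO_95081)
coding_strand = reverse_complement(OLIGO_95081)
print(pool_size, coding_strand)
```

Because 95081 is a complement-strand primer, its 3' end corresponds to the 5' end of the coding-strand read, which is why the CAT triplet appears as ATG (methionine) in the translated alignment.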

When analysed on agarose gel, the PCR product appeared as a heterogeneous mix of 
fragments, one of which had the expected size (around 150 bp). A second 
round of amplification using the approximately 150 bp DNA as template yielded a single 
amplicon which was cloned in pCRII (obtained from Invitrogen) for further 
characterisation. It appeared as a 152 bp fragment whose nucleotide sequence (Fig. 1), 
although similar to all lcrD/flbF homologous genes, shares a higher level of identity with 
the virulence (lcrD-like) genes. 

Table 1. 

oligonucleotide   sequence (1)                       features                     lcrD corresponding codons (2) 
95080             GSH ATH [...]                      direct, degenerate           150 to 156 
95081             GC RTC DCC YTT DAC RAA YTT CAT     complement, degenerate       193 to 200 
95363             CC ATC GAC GCG GAC TTG CGC G       direct, non-degenerate       157 to 164 
95364             CGC GCC GTC CAT GGC GCC ATA        complement, non-degenerate   186 to 192 
96110             C CGA CGC CGA CGC CGT ACG GTC      direct, non-degenerate       172 to 179 

(1) The letter code for nucleotide ambiguity proposed by IUB (Nomenclature Committee, 
1985, Eur. J. Biochem., 150: 1-5) was used. 

(2) The DNA sequence of the lcrD gene from Yersinia enterocolitica used for this work 
was published by Plano et al. (1991). 

To ensure that the cloned fragment was actually a B. pertussis sequence, PCR was 
performed under stringent conditions with serial 10-fold dilutions of DNA from B. 
pertussis. Stringent PCR conditions require a perfect match between 
template and primers. It was likely, however, that owing to the degeneracy of the original 
primers, the 152 bp sequence initially obtained had, at its boundaries, a few base pair 
differences from the actual B. pertussis lcrD-like (hereafter called bcrD) sequence. A 
nested PCR approach using internal primers (oligos 95363 and 95364, Table 1) was 
therefore preferred, as these primers are known to match the correct B. pertussis sequence. A 
dose-response relationship was observed between the 10-fold dilutions of B. pertussis 
template DNA and the product of the nested PCR, suggesting that the 152 bp amplicon 
actually originates from the Bordetella genome. 
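
The geometry of the nested PCR can be checked with a short Python sketch based only on the codon ranges listed in Table 1. The helper names are hypothetical, and the computed sizes are upper bounds, since a primer need not cover every nucleotide of its first or last codon (the cloned amplicon was 152 bp).

```python
# Codon ranges on the Yersinia enterocolitica lcrD reference, copied from Table 1.
PRIMER_CODONS = {
    "95080": (150, 156),  # outer, direct
    "95081": (193, 200),  # outer, complement
    "95363": (157, 164),  # nested, direct
    "95364": (186, 192),  # nested, complement
    "96110": (172, 179),  # hybridisation probe
}


def codon_span_nt(first_codon, last_codon):
    """Length in nucleotides of an inclusive codon range."""
    return (last_codon - first_codon + 1) * 3


def lies_within(inner, outer):
    """True if the inner codon range is contained in the outer one."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


outer = (PRIMER_CODONS["95080"][0], PRIMER_CODONS["95081"][1])
nested = (PRIMER_CODONS["95363"][0], PRIMER_CODONS["95364"][1])

outer_max_bp = codon_span_nt(*outer)
nested_max_bp = codon_span_nt(*nested)
nested_inside_outer = lies_within(nested, outer)

# The probe should hybridise between the two nested primers without overlapping them.
probe = PRIMER_CODONS["96110"]
probe_between_nested_primers = (
    PRIMER_CODONS["95363"][1] < probe[0] and probe[1] < PRIMER_CODONS["95364"][0]
)

print(outer_max_bp, nested_max_bp, nested_inside_outer, probe_between_nested_primers)
```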

Comparison of the 152 bp sequence with lcrD/flbF genes allowed us to define a 
specific DNA stretch (oligo 96110 in Table 1) which was used as a probe for screening a 
genomic library of B. pertussis constructed in the plasmid vector pBR327 (Delisse-Gathoye 
et al., 1990, Infect. Immun. 58: 2895-905). Several positive clones were isolated 
and restriction analysis of their resident plasmids showed that they harboured 
overlapping inserts. The entire nucleotide sequence of one insert was determined, 
revealing a large open reading frame (ORF). This 2100 bp ORF encoded a 75 kDa 
polypeptide which is 59 % and 47 % identical to the yersinial proteins LcrD and FlhA 
respectively. Multiple amino acid sequence comparisons of all known members of the 
LcrD/FlbF family of proteins, including the B. pertussis BcrD deduced amino acid sequence, 
showed that this sequence clearly ranked within the virulence associated determinants 
(Fig. 2). These data strongly suggest that B. pertussis possesses a type III export system, 
involved in the secretion of virulence effectors. 
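
As a rough plausibility check of the stated ORF size and polypeptide mass, the following Python sketch converts an ORF length into a residue count and an approximate molecular mass. It assumes that the 2100 bp include the stop codon and uses the common rule of thumb of about 110 Da per amino acid residue. Both are assumptions made for illustration, not figures taken from the source.

```python
# Rule-of-thumb average mass of one amino acid residue (assumption, not from the source).
AVERAGE_RESIDUE_MASS_DA = 110.0


def orf_to_protein(orf_length_bp, includes_stop=True):
    """Return (residue count, approximate mass in kDa) for an ORF length."""
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_length_bp // 3
    residues = codons - 1 if includes_stop else codons
    return residues, residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


residues, mass_kda = orf_to_protein(2100)
print(residues, round(mass_kda, 1))
```

The estimate lands in the same range as the 75 kDa quoted above; an exact mass would require the actual amino acid composition.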

The B. pertussis lcrD-like nucleotide sequence (bcrD) has been submitted to 
EMBL and assigned the accession number Y13383. 

This general technique has been useful for determining the presence/absence of a 
type III secretion system in other bacterial strains. The human pathogens Borrelia 
burgdorferi and Helicobacter pylori were intensively screened for such a system using 
this technique. No evidence for a type III secretion system could be found. The 
subsequent publication of the genome sequences of these microorganisms has confirmed 
the absence of similar systems in these species. In contrast, the method allowed the 
amplification of a DNA fragment from the phytopathogen Pseudomonas corrugata, 
which clearly ranks among the virulence sequences. This technique could be applied to 
any Gram negative pathogen of medical or agronomic importance such as Neisseria spp, 
Moraxella catarrhalis, Vibrio cholerae, any Enterobacteriaceae, Pseudomonas spp, 
Haemophilus influenzae, Brucella spp, Francisella tularensis, Pasteurella spp, 
Legionella pneumophila. Even in strains that have been fully sequenced, this technique 
can be used as a simple method for checking alternate types or strains of the same 
species. For instance, some types of pathogenic Escherichia coli harbour a type III 
secretion system whereas others do not. 

Example 2: Analysis of the B. pertussis bcrD flanking sequences to characterise the 
pathogenicity island and virulence-related proteins encoded therein 

The tendency for systematic clustering of type III encoding genes inside 
pathogenicity islands prompted the analysis of B. pertussis bcrD flanking sequences. 
The whole region containing the pathogenicity island was sequenced by chromosome 
walking, taking care that each region of the pathogenicity island was 
represented in at least two independent clones, to avoid possible artefacts due to 
chimeric DNA inserts. This revealed clustered ORFs that could be classed in 3 
categories: class I type ORFs (table 2); class II type ORFs (table 3) - the effector proteins 
which have the best vaccinal and diagnostic properties; and insertion sequences and ORFs 
homologous to house keeping genes of other species (table 4). Although there is no 
general rule for defining the boundaries of a pathogenicity island, they can be 
demarcated by a direct or inverse repeat at one or other boundary. However, the absolute 
demarcation of the boundaries can only really be done by the detection of house keeping 
genes at the extremes of the sequence. In the present case, an insertion sequence (IS in 
Fig. 3) was present at the 5' end of the island (separating the virulence ORFs from the 
house keeping genes), but absent at the 3' end. In addition, the presence of house 
keeping genes (greA and ICFG-like) surrounding a locus which, according to sequence 
data, encompasses numerous virulence sequences is a good indication of the boundaries 
of the island. The complete gene organisation of the pathogenicity island is schematically 
represented in Figure 3. The precise definition of the PAI boundaries requires further 
experimental data, such as the characterisation of the corresponding chromosomal region 
of a Bordetella strain which is devoid of a type III secretion system. 

Table 2 

Class I genes, i.e. genes coding for determinants involved in the secretory apparatus 
and their regulation 

Name   Coding sequence from/to       Coding DNA   SEQ ID   Homologous genes (from Yersinia, 
       (with reference to Fig. 5)    strand       NO:      unless otherwise specified) 
bcrD   8656/10755                    complement   1        LcrD 
bcrH   14097/14582                   direct       3        lcrH (= sycD) 
bscC   26955/28757                   direct       5        YscC 
bscD   7379/8659                     complement   7        YscD 
bscE   7039/7338                     complement   9        None 
bscF   6783/7049                     complement   11       YscF 
bscI   17892/18218                   direct       13       YscI 
bscJ   18215/19039                   direct       15       YscJ 
bscK   19032/19694                   direct       17       None 
bscL   19664/20302                   direct       19       YscL 
bscN   20307/21641                   direct       21       YscN 
bscO   21641/22150                   direct       23       YscO 
bscP   22147/22695                   direct       25       None 
bscQ   22692/23771                   direct       27       YscQ 
bscR   23768/24439                   direct       29       YscR 
bscS   24445/24711                   direct       31       YscS 
bscT   24723/25523                   direct       33       YscT 
bscU   25520/26569                   direct       35       YscU 
bscV   26566/26964                   direct       37       None 
brpL   28778/29380                   complement   39       hrpL (Pseudomonas syringae) 
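
The from/to coordinates in Table 2 can be sanity-checked with a few lines of Python. The sketch below takes a subset of rows, computes each ORF length from its inclusive coordinates, and confirms that the length is a whole number of codons. The dictionary layout and function name are illustrative choices, not part of the original disclosure.

```python
# A subset of Table 2: name -> (from, to, strand), coordinates as in Fig. 5.
CLASS_I_ORFS = {
    "bcrD": (8656, 10755, "complement"),
    "bcrH": (14097, 14582, "direct"),
    "bscC": (26955, 28757, "direct"),
    "bscN": (20307, 21641, "direct"),
    "brpL": (28778, 29380, "complement"),
}


def orf_length(start, end):
    """Length in bp of an ORF given inclusive from/to coordinates."""
    return end - start + 1


lengths = {name: orf_length(start, end) for name, (start, end, _) in CLASS_I_ORFS.items()}

for name, length in lengths.items():
    # A complete coding sequence should contain a whole number of codons.
    status = "ok" if length % 3 == 0 else "not a multiple of 3"
    print(f"{name}: {length} bp, {length // 3} codons ({status})")
```

For bcrD the computed length agrees with the 2100 bp ORF reported in Example 1.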






WO 00/37493 



PCT/EP99/10297 



Table 3 

Class II ORFs which putatively code for effector proteins 

Name    Coding sequence from/to       Coding DNA   SEQ ID        Homologous genes (from Yersinia, 
        (with reference to Fig. 5)    strand       NO:           unless otherwise specified) 
bopN    11906/13003                   complement   [illegible]   yopN (= lcrE) 
orf1    6160/6747                     direct       [illegible]   None 
orf2    10752/11120                   complement   45            None 
orf3    11117/11527                   complement   47            None 
orf4    11532/11909                   complement   49            None 
orf5    13002/13784                   direct       51            None 
orf6    13806/14081                   direct       53            None 
orf7    14630/15571                   direct       55            None 
orf8    15601/16803                   direct       57            None 
orf9    16827/17288                   direct       59            BcrH 
orf10   17293/17814                   direct       61            pcr4 (Pseudomonas aeruginosa) 
orf11   29412/29591                   complement   63            None 
orf12   29555/30529                   complement   65            None 
orf13   30631/31776                   direct       67            None 
orf14   31773/33005                   complement   69            None 
orf15   32370/33014                   direct       71            None 

Table 4 

Insertion sequences and house keeping genes 

Name            Coding sequence from/to       Coding DNA   SEQ ID   Homologous sequences 
                (with reference to Fig. 5)    strand       NO: 
not specified   711/2024                      direct       73       uracil permease genes of numerous bacteria 
not specified   2055/3590                     complement   75       chemoreceptor genes of numerous bacteria 
not specified   4220/4696                     direct       77       greA (Escherichia coli) 
not specified   4998/5948                     complement   79       transposase genes of numerous bacteria 
not specified   33002/34852                   complement   81       ICFG gene (Synechocystis sp) 



Next to the bcrD gene, there is an open reading frame (ORF) whose deduced 
amino acid sequence shares significant similarities with the YscU protein of Yersinia spp 
(39% identity and 51% similarity) and other known YscU homologs (Fig. 4). YscU, like 
LcrD, is a component of the Yersinia type III secretion machinery involved in the 
virulence mechanisms of the bacteria. B. pertussis therefore possesses a classical type III 
secretion system which is most probably involved in pathogenicity. This latter point can 
be investigated through phenotypic analyses of mutants (see below). 

The total length of the Pai is approximately 30 to 40 kb. The DNA sequence of 
the whole region is presented in Figure 5, and is referred to in tables 2, 3, and 4. 
Restriction analysis on pulsed-field gel electrophoresis allowed the type III locus to be 
mapped at coordinate position 1,590 kb on the Tohama I strain chromosome. 

No homologies could be found between the B. pertussis Class II Pai DNA 
sequences and the sequences reported in the GenEMBL databases (except for those 
stated in table 3). The expressed products of these unknown genes within the Pai, 
which is responsible for virulence, will be useful in the development of a vaccine formulation 
against pathogenic Bordetella pertussis. 

To address the precise function of the Pai, a bcrD mutant was engineered by 
allelic exchange. In the resulting mutant, the bcrD gene was disrupted by an aphA-3 
cassette conferring kanamycin resistance. This cassette was inserted in such a way that 
translation was not interrupted, avoiding any polar effect on expression of putative 
downstream cistrons. A mutant has been isolated and its associated phenotype is 
currently being analysed. 

Example 3: Analysis of the in situ expression of the genes of the pathogenicity island 

Genetic constructions 

To produce a mutant defective in type III secretion, a 255-bp fragment (codons 
363 to 445) was deleted from the bcrD coding sequence and replaced by a cassette 
containing the aphA-3 gene which confers kanamycin resistance (Menard et al., J. 
Bacteriol. (1993) 175:5899-5906). The aphA-3 cassette was excised from pUC18K by 
EcoRI-PstI digestion and introduced in the bcrD EcoRI-Sse8387I sites. This construct 
generated an early stop in bcrD translation and allowed in-frame translation of the 
remaining 3' end of the mutated gene, avoiding possible polar effects on expression of 
downstream cistrons. The mutated bcrD gene, with its flanking sequences, was excised 
by BglII-NotI cutting and subsequently inserted into the XbaI-EcoRI sites of the suicide 
plasmid pSS1129 (Stibitz, Methods Enzymol. (1994) 235:458-465), by means of DNA 
adaptors. The resulting construction was named pAF214. pAF248 is a derivative of 
pAF214 that contained two additional unique SpeI and PacI sites. These sites, included 
in a pair of complementary oligonucleotides, were introduced into the BamHI site of 
pAF214. Other constructs included pAF245 and pAF246. An 831 bp fragment covering 
the 5' region and the first 4 codons of bcrD was generated by PCR amplification. This 
amplicon was then introduced into BamHI-HindIII linearized pNM480 (Minton, Gene 
(1984) 31:269-273), in such a way that the bcrD initiation codon was placed in frame 
with lacZ, used as a reporter gene. The resulting construct was named pAF245. 
Similarly, primers were designed for placing lacZ downstream of an 849 bp fragment that 
encompassed upstream bscN sequences including its first 3 codons. pAF246 was 
obtained by cloning this fragment in pNM480. 

Transformations and allelic exchanges 

B. pertussis cells, from a freshly saturated culture in 10 ml of SS medium, were 
washed and resuspended in 100 µl of a cold 10% (v/v) glycerol solution. Up to 10 µg of 
supercoiled purified DNA in a maximum of 20 µl of water were added to 100 µl of the 
bacterial suspension. Cells and DNA were transferred to a prechilled 0.2 cm 
electroporation cuvette (Bio-Rad) and placed in a Gene Pulser apparatus (Bio-Rad). 
Pulses were achieved with settings of 25 µF, 2.5 kV, and 600 Ω, giving a time constant 
ranging from 11 to 14 ms. 
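
The relation between the pulser settings and the observed time constant can be sketched in Python. For an exponential-decay pulse the time constant is the product of resistance and capacitance, so the settings alone give an upper limit. The sketch further assumes, as a simplification not stated in the source, that the sample behaves as a resistor in parallel with the 600 Ω setting. All function names are hypothetical.

```python
def time_constant_ms(resistance_ohm, capacitance_uF):
    """RC time constant in milliseconds for an exponential-decay pulse."""
    return resistance_ohm * capacitance_uF * 1e-6 * 1e3


def field_strength_kv_per_cm(voltage_kv, gap_cm):
    """Nominal field strength across the cuvette gap."""
    return voltage_kv / gap_cm


def implied_sample_resistance_ohm(observed_tau_ms, capacitance_uF, parallel_ohm):
    """Sample resistance implied by an observed time constant.

    Assumes the sample is purely resistive and in parallel with the
    instrument's resistor setting (a simplifying assumption).
    """
    effective_ohm = (observed_tau_ms * 1e-3) / (capacitance_uF * 1e-6)
    return 1.0 / (1.0 / effective_ohm - 1.0 / parallel_ohm)


tau_max_ms = time_constant_ms(600, 25)            # upper limit from the settings
field = field_strength_kv_per_cm(2.5, 0.2)        # 0.2 cm cuvette at 2.5 kV
r_sample_low = implied_sample_resistance_ohm(11, 25, 600)
r_sample_high = implied_sample_resistance_ohm(14, 25, 600)

print(tau_max_ms, field, round(r_sample_low), round(r_sample_high))
```

Under these assumptions the reported 11 to 14 ms sits just below the 15 ms limit set by the instrument, as expected for a low-conductivity suspension in 10% glycerol.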

After their initial isolation on BG plus gentamycin, pAF214 and pAF248 
transformants that had undergone a second recombination step were selected on streptomycin 
as described (Stibitz, supra). The null bcrD mutants were finally distinguished from 
revertants by their acquired resistance to kanamycin. The proper integration of the aphA-3 
cassette was assessed by Southern blot analysis. In contrast, introduction of pAF245 and 
pAF246 only required a single crossover selected on BG plus ampicillin. This 
recombination step led to the placement of the lacZ coding sequence under the control of 
the signals governing the transcription of bcrD and bscN respectively. 

Mouse model 

After two days of growth on BG agar plates, wild type and mutant bacteria were 
recovered and resuspended in PBS at a concentration of 10^8 CFU ml^-1. 25 µl of the 
suspension were injected in each nostril of pentobarbital anaesthetized mice. Lung 
colonization was assayed after 4 h, 3, 7, 14, 26, 39 and 45 days by treating both lungs of 
each mouse in an Ultra-Turrax grinder and titrating the resuspended bacteria on BG agar 
plates. 

β-galactosidase assay 

0.5 ml of bacterial suspensions coming from liquid cultures grown to log phase 
(OD = 0.2), were assayed as described previously (Miller, (1972) "Experiments in 
molecular genetics." Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.). We 
used the chromogenic substrate o-nitrophenyl-β-D-galactoside (ONPG) from Sigma. 
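
For readers unfamiliar with the assay, the standard Miller unit calculation can be sketched as follows. The formula is the one commonly attributed to Miller (1972). The example readings other than the 0.5 ml volume and the OD of 0.2 are hypothetical values chosen for illustration and are not measurements from this work.

```python
def miller_units(od420, od550, minutes, volume_ml, od600):
    """Beta-galactosidase activity in Miller units.

    od420   absorbance of the reaction (o-nitrophenol plus light scattering)
    od550   absorbance at 550 nm, used to correct for cell debris scattering
    minutes reaction time
    volume_ml volume of culture assayed
    od600   cell density of the culture
    """
    return 1000.0 * (od420 - 1.75 * od550) / (minutes * volume_ml * od600)


# Hypothetical readings for a 0.5 ml sample of a culture at OD 0.2.
example_units = miller_units(od420=0.5, od550=0.0, minutes=10, volume_ml=0.5, od600=0.2)
print(example_units)
```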

Transcription of both bcrD and bscN appears to be controlled by the bvg locus 

Most of the Bordetella virulence functions are controlled by the bvg locus. The 
Bvg+ phase is characterized by the expression of virulence factors and is necessary for 
colonization of animal models. In contrast, the bacteria are avirulent in Bvg- phase 
which can be induced by nicotinic acid or MgSO4. We investigated the level of 
expression of two genes that belong to distinct transcription units, i.e. bcrD and 
bscN, by using transcriptional fusions of lacZ into these genes. To this end, we isolated 
the mutants NIVh86 and NIVh87, which integrated pAF245 and pAF246 respectively. 
In the former mutant, a single recombination step set lacZ in place of the 
bcrD coding sequence, whereas in the latter, lacZ replaced bscN. The level of expression 
of both bcrD and bscN transcripts was assessed in either Bvg+ or Bvg- phase. Both B. 
pertussis genes were weakly expressed in vitro. However, these levels of 
expression appeared to be clearly modulated by the Bvg system. Indeed, whereas 
β-galactosidase could be assayed in Bvg+ conditions, no enzyme activity was detected in 
Bvg- phase (table 5). 



Table 5. β-galactosidase activity, in Miller units (Miller, supra), when lacZ is placed 
under the control of the signals that direct the expression of bcrD or bscN. 

transcript   Bvg+ phase   Bvg- phase 
bcrD         3.54         0.02 
bscN         1.65         0.04 
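
The degree of Bvg modulation can be expressed as a simple ratio of the Table 5 values. The Python sketch below does only that arithmetic. Because the Bvg- values are close to zero, the resulting ratios should be read as rough indications, not precise fold inductions.

```python
# Miller units from Table 5: transcript -> (Bvg+ phase, Bvg- phase).
TABLE_5 = {
    "bcrD": (3.54, 0.02),
    "bscN": (1.65, 0.04),
}

# Ratio of activity in Bvg+ phase to activity in Bvg- phase.
fold_change = {name: plus / minus for name, (plus, minus) in TABLE_5.items()}

for name, ratio in fold_change.items():
    print(f"{name}: about {ratio:.0f}-fold higher in Bvg+ phase")
```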



Example 4: Recombinant expression of effector protein vaccine candidates 

In the discovered sequence, seven ORFs (orf2 to orf8) fulfil criteria 
that make them good candidates as effector proteins and vaccine candidates. 
First, they are surrounded by typical type III secretion (class I) genes, and therefore 
clearly belong to the type III secretion locus. Second, they do not display 
significant similarities with genes present in related type III systems from other 
organisms, and are therefore likely to be effector proteins specific for Bordetella. In 
addition to these ORFs, bopN, orf9 and orf10 are also of particular interest as vaccine 
candidates. Although these sequences do not fulfil the second criterion above 
(they have some similarity to popN, pcrH and pcr4 of Pseudomonas aeruginosa), their 
products may also be exported by the specialized translocon. For these reasons, ten 
ORFs, i.e. orf2 to orf10 and bopN, were selected for further analysis. To this end, ten pairs 
of primers (table 6) were designed for amplifying their corresponding ORFs. The 
amplified ORFs were then cloned in the pCR-TOPO® T/A cloning system (Invitrogen) 
and their sequences were checked for errors putatively induced by the Taq DNA 
polymerase. Correct inserts were retrieved by EcoRI and BamHI (or BglII - see table 6) 
cutting and transferred into the pMAL® vectors (New England Biolabs; Maina et al., 
Gene (1988) 74:365-373), opened by EcoRI and BamHI restriction. In these vectors, 
expression of the cloned inserts yields recombinant proteins fused to the maltose binding 
protein (MBP) of E. coli. The MBP domain of the fusion protein provides a means for 
both detecting the expressed product and purifying it by affinity chromatography. 

Four ORFs, namely orf2, orf4 and orf10 on the one hand, and orf6 on the other, have 
been cloned into pMAL-c2E® and pMAL-p2E® respectively. Transformed bacteria, 
grown in 300 ml of culture medium, were induced with IPTG (300 µM) and lysed in a 
French pressure cell. Insoluble material was pelleted by ultracentrifugation and 
discarded whereas the resulting supernatant was applied to an amylose resin. Fusion 
proteins, which specifically bind to the amylose through their MBP domain, were then 
eluted by application of 10 mM maltose. This method allowed us to recover from 10 to 
50 mg of each fusion protein (Fig. 6). The expressed Bordetella products may be 
separated from the MBP by utilising the enterokinase cleavage site between the 
Bordetella polypeptide and the MBP. The other ORFs should be expressible using a 
similar approach. 

The secreted proteins will be analysed using standard techniques to confirm their 
functional and immunological properties. First, the immunogenicity of the secreted 
proteins will be assessed by investigating the presence of antibodies directed against 
these proteins in the serum of infected patients. In addition, their putative recognition as 
protective antigens will be based on challenge experiments carried out in a mouse model. 
Second, the biological properties of the effector proteins will be assessed by analysing 
their catalytic activities. For instance, it is expected that one of the secreted proteins 
would display a tyrosine phosphatase activity. Finally, the function of the effector 
proteins will be investigated by microinjecting the proteins into the cytoplasm of 
eukaryotic cells. This will allow us to reveal putative activities of inhibition of actin 
polymerisation, cytotoxicity or induction of apoptosis, i.e. those types of activities that 
have been assigned to effector proteins secreted by type III secretion systems discovered 
in other species. 

30 
Table 6. PCR primers used for amplifying the ORFs encoding vaccine candidates.

ORF    Primer      Sequence
orf2   direct      5'-GAG GAA TTC CAT ATG CCC ACC ATG ATG CCG CAT ACC CTA CCC TCG
       complement  5'-TCT AGA GGA TCC GGC GAA TGG ATT TCT TGC TCG TCA
orf3   direct      5'-GAG GAA TTC CAT ATG CCC ACC ATG TCC AGC GCC GTA CCC GGC
       complement  5'-TCT AGA GGA TCC AGG GTA GGG TAT GCG GCA TCA TCC
orf4   direct      5'-GAG GAA TTC CAT ATG CCC ACC ATG AAT ACT GCC GAT AGG GCG CTG
       complement  5'-TCT AGA GGA TCC GGT ACG GCG CTG GAC ATG GCG TC
bopN   direct      5'-GAG GAA TTC CAT ATG CCC ACC ATG ACT CGT ATC GAT GCC GCC
       complement  5'-TCT AGA GGA TCC GCG CCC TAT CGG CAG TAT TCA TGC
orf5   direct      5'-GAG GAA TTC CAT ATG CCC ACC ATG GGG AGT CCT CGG AGA AGG AA
       complement  5'-TCT AGA GGA TCC ATA CTC CTT GTG CAG CGC TTA GCG
orf6   direct      5'-GAG GAA TTC CAT ATG CCC ACC ATG CAG GAG CAA GGC ATC CAA TC
       complement  5'-TCT AGA GGA TCC CAT GGA AGG CCT CCG CGC TCA GAC
orf7   direct      5'-GAG GAA TTC CAT ATG CCC ACC ATG TCT GTT TCT CCG ACT TCG CCC
       complement  5'-TCT AGA GGA TCC TGA AGG TTG GAG CCG GAC ACT CAG
orf8   direct      5'-GAG GAA TTC CAT ATG CCC ACC ATG ACC GTC ATG AGT ACG ACC ATA
       complement  5'-TCT AGA TCT TTC CTT GAG CGC CCG GCG CTA CA
orf9   direct      5'-GAG GAA TTC CAT ATG CCC ACC ATG ACT GTT CAT GAC GAC GCG
       complement  5'-TCT AGA GGA TCC GAG TCT GAG TGC ATG GAG TTA CTC C
orf10  direct      5'-GAG GAA TTC CAT ATG CCC ACC ATG CAC TCA GAC TCA GGT TCA GAT
       complement  5'-TCT AGA GGA TCC TCG CCG TCA GAT CCA AAT TCA TCC AG

Initiation and STOP codons of the corresponding ORF are written in bold. The cloning
sites EcoRI, BamHI or BglII are underlined. All but one of the complementary primers
contain a BamHI site. In the case of orf8, as it presents an internal BamHI recognition
sequence, a BglII site was preferred.
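
The cloning-site convention described in the note to Table 6 can be checked mechanically. The following is a minimal sketch, not part of the original method: the helper name `find_sites` is hypothetical, and the recognition sequences used (EcoRI GAATTC, BamHI GGATCC, BglII AGATCT) are the standard ones for these enzymes rather than values stated in the text.

```python
# Standard recognition sequences of the cloning enzymes named in the Table 6 note
# (assumed here; the table itself only marks the sites by underlining).
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "BglII": "AGATCT"}


def find_sites(primer):
    """Return {enzyme: [0-based start positions]} for sites present in a primer."""
    # Drop the 5' label and the codon spacing used in Table 6.
    seq = primer.replace("5'-", "").replace(" ", "").upper()
    hits = {}
    for enzyme, site in SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


# Three primers copied from Table 6.
orf2_direct = "5'-GAG GAA TTC CAT ATG CCC ACC ATG ATG CCG CAT ACC CTA CCC TCG"
orf2_complement = "5'-TCT AGA GGA TCC GGC GAA TGG ATT TCT TGC TCG TCA"
orf8_complement = "5'-TCT AGA TCT TTC CTT GAG CGC CCG GCG CTA CA"

for name, primer in [("orf2 direct", orf2_direct),
                     ("orf2 complement", orf2_complement),
                     ("orf8 complement", orf8_complement)]:
    print(name, find_sites(primer))
```

Under these assumptions the direct primer carries only an EcoRI site, the orf2 complementary primer only a BamHI site, and the orf8 complementary primer only a BglII site, which matches the note.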
SEQUENCE LISTING

<110> Alex Bollen
      Alain Fauconnier
      Edmond Godfroid

<120> Vaccine

<130> B45168

<160> 82

<170> FastSEQ for Windows Version 3.0

<210> 1
<211> 2100
<212> DNA
<213> Bordetella pertussis

<220>
<221> CDS
<222> (1)...(2100)

<400> 1
atg acg agc aag aaa tcc att cgc cgc ctg caa cgc gcg gtg gcg ctg 48
Met Thr Ser Lys Lys Ser Ile Arg Arg Leu Gln Arg Ala Val Ala Leu
1 5 10 15

gcc acc agc cgc aac gac atc gta ctg gcc gtg ctc atc gtg gcg atc 96
Ala Thr Ser Arg Asn Asp Ile Val Leu Ala Val Leu Ile Val Ala Ile
20 25 30

gtc ttc atg atg atc ctg ccg ttg ccc aca acg ctg gtc gac gtg ctg 144
Val Phe Met Met Ile Leu Pro Leu Pro Thr Thr Leu Val Asp Val Leu
35 40 45

atc ggt gcg aac atg acg ctg tcg gca gtc ctg ctg atg gtc gcg atg 192
Ile Gly Ala Asn Met Thr Leu Ser Ala Val Leu Leu Met Val Ala Met
50 55 60

tac ctg cct tcg ccc ctg gcg ttt tcc tcg ttc cct tcg gtc ctg ctg 240
Tyr Leu Pro Ser Pro Leu Ala Phe Ser Ser Phe Pro Ser Val Leu Leu
65 70 75 80

gtn acn acg ctg ttc cgg ctg ggc atc tcc atc gcg acc acg cgg ctg 288
Val Thr Thr Leu Phe Arg Leu Gly Ile Ser Ile Ala Thr Thr Arg Leu
85 90 95

atc ctg ctg caa ggc gat gcc ggc cac atc atc gag acc ttc ggc aac 336
Ile Leu Leu Gln Gly Asp Ala Gly His Ile Ile Glu Thr Phe Gly Asn
100 105 110

ttc gtg gtg ggc ggc aac ctg atc gtc ggc ctg gtg gtt ttc ctc atc 384
Phe Val Val Gly Gly Asn Leu Ile Val Gly Leu Val Val Phe Leu Ile
115 120 125

ctc acg atc gtg cag ttc gtg gtc atc acc aaa ggc gcg gag cgg gtg 432
Leu Thr Ile Val Gln Phe Val Val Ile Thr Lys Gly Ala Glu Arg Val
130 135 140

gcc gaa gtc gcc gcg cgc ttc tcg ctg gac gcc atg ccc ggc aag cag 480
Ala Glu Val Ala Ala Arg Phe Ser Leu Asp Ala Met Pro Gly Lys Gln
145 150 155 160

atg tcc atc gac gcg gac ttg cgc gcg ggc acc ata gac atg gac gaa 528
Met Ser Ile Asp Ala Asp Leu Arg Ala Gly Thr Ile Asp Met Asp Glu
165 170 175

gcc cga cgc cga cgc cgt acg gtc gag aag gaa agc caa ctg tat ggc 576
Ala Arg Arg Arg Arg Arg Thr Val Glu Lys Glu Ser Gln Leu Tyr Gly
180 185 190

gcc atg gac ggc gcg atg aag ttc gtc aag ggc gat gcc atc gcc ggc 624
Ala Met Asp Gly Ala Met Lys Phe Val Lys Gly Asp Ala Ile Ala Gly
195 200 205

ctg atc atc gtt gcc gtc aac ctg ctt ggc ggc atg ctg gtc ggc gtg 672
Leu Ile Ile Val Ala Val Asn Leu Leu Gly Gly Met Leu Val Gly Val
210 215 220

ctg cag cgc ggc ctg agc gcc ggc gag gcc gtg cag aca tat gcc atc 720
Leu Gln Arg Gly Leu Ser Ala Gly Glu Ala Val Gln Thr Tyr Ala Ile
225 230 235 240

ctg acc ata ggc gac ggg ctc atc gcg cag atc ccg gcg ctg ttc atc 768
Leu Thr Ile Gly Asp Gly Leu Ile Ala Gln Ile Pro Ala Leu Phe Ile
245 250 255

gcc atc tgc gcc gga atc atc gtg acg cgg gtg cag acc ggg gat ggc 816
Ala Ile Cys Ala Gly Ile Ile Val Thr Arg Val Gln Thr Gly Asp Gly
260 265 270

ccc tcc aac gta ggc acc gac atc ggc gca caa gtg ctg gcg cag cct 864
Pro Ser Asn Val Gly Thr Asp Ile Gly Ala Gln Val Leu Ala Gln Pro
275 280 285

cgc gcc ctg gtc att gcc ggc gcg atc tcg gca ggc ctg ggc ctc att 912
Arg Ala Leu Val Ile Ala Gly Ala Ile Ser Ala Gly Leu Gly Leu Ile
290 295 300

ccc ggc atg ccc acg ctg gtc ttc ttc gcc ctg gcc gcc gcg gtg ggc 960
Pro Gly Met Pro Thr Leu Val Phe Phe Ala Leu Ala Ala Ala Val Gly
305 310 315 320

acc atc ggt ttc gta ctg ctg cgc gca tcc cag cgt ccg ccc gaa ggc 1008
Thr Ile Gly Phe Val Leu Leu Arg Ala Ser Gln Arg Pro Pro Glu Gly
325 330 335

gcc gag ccc gcg ctc gcc ggc atg gct gcc gac ggc cag ccc cgc acc 1056
Ala Glu Pro Ala Leu Ala Gly Met Ala Ala Asp Gly Gln Pro Arg Thr
340 345 350

cgc gcg ccg gcg gat ggg cag gcg gaa ttc gcc ccc acc gtc ccg ctg 1104
Arg Ala Pro Ala Asp Gly Gln Ala Glu Phe Ala Pro Thr Val Pro Leu
355 360 365

atc atc gac gta gcc gcg cgg ctg cag ccc cgg ttc gag ccg gcc acc 1152
Ile Ile Asp Val Ala Ala Arg Leu Gln Pro Arg Phe Glu Pro Ala Thr
370 375 380

ctc acc gac gat ctg ctg cag atc cgg cgg gcg ctc tat ttc gac ctg 1200
Leu Thr Asp Asp Leu Leu Gln Ile Arg Arg Ala Leu Tyr Phe Asp Leu
385 390 395 400

ggc gtg ccg ttt ccc ggc atc cag ttg cgc ttc acc gaa gcg ctg gcc 1248
Gly Val Pro Phe Pro Gly Ile Gln Leu Arg Phe Thr Glu Ala Leu Ala
405 410 415

gcc aat acc tac acc atc gtg ctg tcg gag atc ccg gtg gcg caa gga 1296
Ala Asn Thr Tyr Thr Ile Val Leu Ser Glu Ile Pro Val Ala Gln Gly
420 425 430

atg ttg cgc gac gat gcc gtg ctg gtg cgg gac acc gag cag aac ctg 1344
Met Leu Arg Asp Asp Ala Val Leu Val Arg Asp Thr Glu Gln Asn Leu
435 440 445

cag gcc ctg cgg atc gca tac gaa acg ggc gcg gcc ttt ctg ccc gat 1392
Gln Ala Leu Arg Ile Ala Tyr Glu Thr Gly Ala Ala Phe Leu Pro Asp
450 455 460

acg ccc acg atc tgg gtt gcg gcc agt ctg acc ggc gcc ttg cgc gat 1440
Thr Pro Thr Ile Trp Val Ala Ala Ser Leu Thr Gly Ala Leu Arg Asp
465 470 475 480

gca ggt att cct tac ctg ggt atc agc cag atc ctg act tgg cac ttg 1488
Ala Gly Ile Pro Tyr Leu Gly Ile Ser Gln Ile Leu Thr Trp His Leu
485 490 495

gca tat gta ttg aaa aaa tat tca gcc gat ttc atc ggc atc cag gaa 1536
Ala Tyr Val Leu Lys Lys Tyr Ser Ala Asp Phe Ile Gly Ile Gln Glu
500 505 510

acc cgg ttt ctg ctt tcg gcc atg gaa gaa cga ttt ccc gat ctg gtc 1584
Thr Arg Phe Leu Leu Ser Ala Met Glu Glu Arg Phe Pro Asp Leu Val
515 520 525

aag gag tgc ctg cgc gtc atg ccg gtg cag aag att gcc gaa atc ctg 1632
Lys Glu Cys Leu Arg Val Met Pro Val Gln Lys Ile Ala Glu Ile Leu
530 535 540

cag cgc ctt gtt tcc gaa gaa gtg tcg ata cgc aac ctg cgc gcc gtc 1680
Gln Arg Leu Val Ser Glu Glu Val Ser Ile Arg Asn Leu Arg Ala Val
545 550 555 560

ctg gaa gcg ctg gtc gaa tgg ggc cag aag gaa aag gat acc gtc ctg 1728
Leu Glu Ala Leu Val Glu Trp Gly Gln Lys Glu Lys Asp Thr Val Leu
565 570 575

ctt acg gag tat gtc cga atc gca ctc aag cgc tat atc agc cac aag 1776
Leu Thr Glu Tyr Val Arg Ile Ala Leu Lys Arg Tyr Ile Ser His Lys
580 585 590

tac acc agc ggc cac aat atc ctg ccc gcc tac ctg ctg gcc ccc aag 1824
Tyr Thr Ser Gly His Asn Ile Leu Pro Ala Tyr Leu Leu Ala Pro Lys
595 600 605

gtc gag gaa acc gtg cgc gcc gcc atc cgg cag acc gcc gcc ggc agt 1872
Val Glu Glu Thr Val Arg Ala Ala Ile Arg Gln Thr Ala Ala Gly Ser
610 615 620

tat ctc gcc ctc gat ccg gac acg aca cgc cga ctg gtc gag cac atc 1920
Tyr Leu Ala Leu Asp Pro Asp Thr Thr Arg Arg Leu Val Glu His Ile
625 630 635 640

cgt caa tgt gtc ggc gat ctg gcc gcc ggc gcg agc cgt ccc gtc ttg 1968
Arg Gln Cys Val Gly Asp Leu Ala Ala Gly Ala Ser Arg Pro Val Leu
645 650 655

ctg acg tcg atg gac atc cgg cgc tac acg cgc aag atg ata gaa gcc 2016
Leu Thr Ser Met Asp Ile Arg Arg Tyr Thr Arg Lys Met Ile Glu Ala
660 665 670

gat ctc tac gcc ctg ccg gtg ctg tcc tac cag gaa ctg acg ccg gag 2064
Asp Leu Tyr Ala Leu Pro Val Leu Ser Tyr Gln Glu Leu Thr Pro Glu
675 680 685

atc aat gta cag ccc ctg ggc agg gtg gat cta tga 2100
Ile Asn Val Gln Pro Leu Gly Arg Val Asp Leu *
690 695
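
The pairing of each codon row with its three-letter translation can be verified with the standard genetic code. The sketch below is illustrative only: the function name `translate` is hypothetical, the codon table is the standard one (NCBI translation table 1), and the two test strings are the first and last codon rows of SEQ ID NO: 1 written with the unambiguous bases a, c, g and t.

```python
from itertools import product

# Standard genetic code in the conventional t, c, a, g ordering of all 64 codons.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a coding sequence (spaces allowed) into one-letter amino acids."""
    seq = dna.replace(" ", "").lower()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# First and last codon rows of SEQ ID NO: 1.
first_row = "atg acg agc aag aaa tcc att cgc cgc ctg caa cgc gcg gtg gcg ctg"
last_row = "atc aat gta cag ccc ctg ggc agg gtg gat cta tga"

print(translate(first_row))
print(translate(last_row))
```

The one-letter output for the first row corresponds to Met Thr Ser Lys Lys Ser Ile Arg Arg Leu Gln Arg Ala Val Ala Leu, and the final row ends in the stop codon, shown as `*`.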



<210> 2
<211> 699
<212> PRT
<213> Bordetella pertussis

<400> 2
Met Thr Ser Lys Lys Ser Ile Arg Arg Leu Gln Arg Ala Val Ala Leu
1 5 10 15
Ala Thr Ser Arg Asn Asp Ile Val Leu Ala Val Leu Ile Val Ala Ile
20 25 30
Val Phe Met Met Ile Leu Pro Leu Pro Thr Thr Leu Val Asp Val Leu
35 40 45
Ile Gly Ala Asn Met Thr Leu Ser Ala Val Leu Leu Met Val Ala Met
50 55 60
Tyr Leu Pro Ser Pro Leu Ala Phe Ser Ser Phe Pro Ser Val Leu Leu
65 70 75 80
Val Thr Thr Leu Phe Arg Leu Gly Ile Ser Ile Ala Thr Thr Arg Leu
85 90 95
Ile Leu Leu Gln Gly Asp Ala Gly His Ile Ile Glu Thr Phe Gly Asn
100 105 110
Phe Val Val Gly Gly Asn Leu Ile Val Gly Leu Val Val Phe Leu Ile
115 120 125
Leu Thr Ile Val Gln Phe Val Val Ile Thr Lys Gly Ala Glu Arg Val
130 135 140
Ala Glu Val Ala Ala Arg Phe Ser Leu Asp Ala Met Pro Gly Lys Gln
145 150 155 160
Met Ser Ile Asp Ala Asp Leu Arg Ala Gly Thr Ile Asp Met Asp Glu
165 170 175
Ala Arg Arg Arg Arg Arg Thr Val Glu Lys Glu Ser Gln Leu Tyr Gly
180 185 190
Ala Met Asp Gly Ala Met Lys Phe Val Lys Gly Asp Ala Ile Ala Gly
195 200 205
Leu Ile Ile Val Ala Val Asn Leu Leu Gly Gly Met Leu Val Gly Val
210 215 220
Leu Gln Arg Gly Leu Ser Ala Gly Glu Ala Val Gln Thr Tyr Ala Ile
225 230 235 240
Leu Thr Ile Gly Asp Gly Leu Ile Ala Gln Ile Pro Ala Leu Phe Ile
245 250 255
Ala Ile Cys Ala Gly Ile Ile Val Thr Arg Val Gln Thr Gly Asp Gly
260 265 270
Pro Ser Asn Val Gly Thr Asp Ile Gly Ala Gln Val Leu Ala Gln Pro
275 280 285
Arg Ala Leu Val Ile Ala Gly Ala Ile Ser Ala Gly Leu Gly Leu Ile
290 295 300
Pro Gly Met Pro Thr Leu Val Phe Phe Ala Leu Ala Ala Ala Val Gly
305 310 315 320
Thr Ile Gly Phe Val Leu Leu Arg Ala Ser Gln Arg Pro Pro Glu Gly
325 330 335
Ala Glu Pro Ala Leu Ala Gly Met Ala Ala Asp Gly Gln Pro Arg Thr
340 345 350
Arg Ala Pro Ala Asp Gly Gln Ala Glu Phe Ala Pro Thr Val Pro Leu
355 360 365
Ile Ile Asp Val Ala Ala Arg Leu Gln Pro Arg Phe Glu Pro Ala Thr
370 375 380
Leu Thr Asp Asp Leu Leu Gln Ile Arg Arg Ala Leu Tyr Phe Asp Leu
385 390 395 400
Gly Val Pro Phe Pro Gly Ile Gln Leu Arg Phe Thr Glu Ala Leu Ala
405 410 415
Ala Asn Thr Tyr Thr Ile Val Leu Ser Glu Ile Pro Val Ala Gln Gly
420 425 430
<213> Bordetella pertussis

<220>
<221> CDS
<222> (1)...(486)

<400> 3
atg cca aag tca gcc gac cag ggc ggc tcc ccg gcg tca gct tcg cat 48
Met Pro Lys Ser Ala Asp Gln Gly Gly Ser Pro Ala Ser Ala Ser His
1 5 10 15

gag gcg ttg cgc cat att ctc gac gca ggc gct tcg atg ggg ggc ttg 96
Glu Ala Leu Arg His Ile Leu Asp Ala Gly Ala Ser Met Gly Gly Leu
20 25 30

cag ggg ttg gac gag gcg cag cag cag gcg ttg tac gcg atc ggt cat 144
Gln Gly Leu Asp Glu Ala Gln Gln Gln Ala Leu Tyr Ala Ile Gly His
35 40 45

ggc gcc tac gaa cag ggg cgc tat gcc gac gcg ttg aaa atg ttc tgc 192
Gly Ala Tyr Glu Gln Gly Arg Tyr Ala Asp Ala Leu Lys Met Phe Cys
50 55 60

ctg ctg gtc gcg tgc gat ccg ctg gaa gcc cgt tat ctg ctg gcc ctg 240
Leu Leu Val Ala Cys Asp Pro Leu Glu Ala Arg Tyr Leu Leu Ala Leu
65 70 75 80

ggc gcc gcg gcc cag gag ctg ggg ctg tac gag cat gcc ttg cag caa 288
Gly Ala Ala Ala Gln Glu Leu Gly Leu Tyr Glu His Ala Leu Gln Gln
85 90 95

tac gcg gcc gcg gcg gct ttg cag ttg gac tcc ccc agg ccc ctg ttg 336
Tyr Ala Ala Ala Ala Ala Leu Gln Leu Asp Ser Pro Arg Pro Leu Leu
100 105 110

cat ggc gcc gag tgc ctg tat gcg ttg ggt cgt cgc cgc gac gcc ctg 384
His Gly Ala Glu Cys Leu Tyr Ala Leu Gly Arg Arg Arg Asp Ala Leu
115 120 125

gat acg ctc gac atg gtg ctt gag ttg tgc ggc tcg ccg gag cgt gcg 432
Asp Thr Leu Asp Met Val Leu Glu Leu Cys Gly Ser Pro Glu Arg Ala
130 135 140

gcc ctg cgc gaa cgg gcc gag ttg ctg cgc agg agc tat gca cgt gcc 480
Ala Leu Arg Glu Arg Ala Glu Leu Leu Arg Arg Ser Tyr Ala Arg Ala
145 150 155 160

gac tga 486
Asp *

<210> 4
<211> 161
<212> PRT
<213> Bordetella pertussis

<400> 4
Met Pro Lys Ser Ala Asp Gln Gly Gly Ser Pro Ala Ser Ala Ser His
1 5 10 15
Glu Ala Leu Arg His Ile Leu Asp Ala Gly Ala Ser Met Gly Gly Leu
20 25 30
Gln Gly Leu Asp Glu Ala Gln Gln Gln Ala Leu Tyr Ala Ile Gly His
35 40 45
Gly Ala Tyr Glu Gln Gly Arg Tyr Ala Asp Ala Leu Lys Met Phe Cys
50 55 60
Leu Leu Val Ala Cys Asp Pro Leu Glu Ala Arg Tyr Leu Leu Ala Leu
65 70 75 80
Gly Ala Ala Ala Gln Glu Leu Gly Leu Tyr Glu His Ala Leu Gln Gln
85 90 95
Tyr Ala Ala Ala Ala Ala Leu Gln Leu Asp Ser Pro Arg Pro Leu Leu
100 105 110
His Gly Ala Glu Cys Leu Tyr Ala Leu Gly Arg Arg Arg Asp Ala Leu
115 120 125
Asp Thr Leu Asp Met Val Leu Glu Leu Cys Gly Ser Pro Glu Arg Ala
130 135 140
Ala Leu Arg Glu Arg Ala Glu Leu Leu Arg Arg Ser Tyr Ala Arg Ala
145 150 155 160
Asp



<210> 5
<211> 1803
<212> DNA
<213> Bordetella pertussis

<220>
<221> CDS
<222> (1)...(1803)

<400> 5
atg gca ata ggt cgg ctt ggg tat ctt gtc cgc ggc gca tgg gcc ggg 48
Met Ala Ile Gly Arg Leu Gly Tyr Leu Val Arg Gly Ala Trp Ala Gly
1 5 10 15

ggt gtc atg ctg ttg gcg gcc ggt agc gcc tgg gcg gcg ccg aac tgg 96
Gly Val Met Leu Leu Ala Ala Gly Ser Ala Trp Ala Ala Pro Asn Trp
20 25 30

cct ttg gcg ccg tat agc tac tac gcg cag cag cag agc ctg tcc gat 144
Pro Leu Ala Pro Tyr Ser Tyr Tyr Ala Gln Gln Gln Ser Leu Ser Asp
35 40 45

gtg ctg cgc gag ttc gcc gca ggc ttc agc ctg gcg ttg caa cag ggc 192
Val Leu Arg Glu Phe Ala Ala Gly Phe Ser Leu Ala Leu Gln Gln Gly
50 55 60

aaa ggg gtg caa ggc gtg gtc aat ggg cgt ttc aat gcg cgc aca ccc 240
Lys Gly Val Gln Gly Val Val Asn Gly Arg Phe Asn Ala Arg Thr Pro
65 70 75 80

acg gag ttc atc gag cgt ctc agc ggc atc tat ggg ttc aac tgg ttc 288
Thr Glu Phe Ile Glu Arg Leu Ser Gly Ile Tyr Gly Phe Asn Trp Phe
85 90 95

gtg cat gcc ggc acg ctg tat gtc agc cgc acc agc gac gtg gtt acc 336
Val His Ala Gly Thr Leu Tyr Val Ser Arg Thr Ser Asp Val Val Thr
100 105 110

cgc gcg gtg gat gca gcc ggc gct tcg ccg tcg gcg ttg cgc cag gcc 384
Arg Ala Val Asp Ala Ala Gly Ala Ser Pro Ser Ala Leu Arg Gln Ala
115 120 125

ttg ctg caa ctg ggc atc ctg gac gaa cgc ttc gga tgg gga gag ctg 432
Leu Leu Gln Leu Gly Ile Leu Asp Glu Arg Phe Gly Trp Gly Glu Leu
130 135 140

ccg gcg caa ggc gtg gcc atg gtg tca ggg ccg ccg gcc tat gtc gcg 480
Pro Ala Gln Gly Val Ala Met Val Ser Gly Pro Pro Ala Tyr Val Ala
145 150 155 160

ctg gtc gag cag gcg gta gcg gcg ttg ccc aag ggg gcc ggc aat cag 528
Leu Val Glu Gln Ala Val Ala Ala Leu Pro Lys Gly Ala Gly Asn Gln
165 170 175

cag gtg gcg gtg ttt cgc ctc aag cat gct tcc gtg agc gac cgg gtg 576
Gln Val Ala Val Phe Arg Leu Lys His Ala Ser Val Ser Asp Arg Val
180 185 190

atc cgt tat cga gac cag cag gta gtt acg ccg ggg atg gcc acc atg 624
Ile Arg Tyr Arg Asp Gln Gln Val Val Thr Pro Gly Met Ala Thr Met
195 200 205

ctg cgc caa ttg atc ctg ggg gcg ggg ccg ggc aac gac gcg gcg ctg 672
Leu Arg Gln Leu Ile Leu Gly Ala Gly Pro Gly Asn Asp Ala Ala Leu
210 215 220

gcc gcg gtg gcg gcg ccg ctg cgg gaa aat ccg ccg gtg ttc ggc gat 720
Ala Ala Val Ala Ala Pro Leu Arg Glu Asn Pro Pro Val Phe Gly Asp
225 230 235 240

gcg gca gct gac ggg aac gcg ccg ctc gct ggc gca gcc cag gca gcc 768
Ala Ala Ala Asp Gly Asn Ala Pro Leu Ala Gly Ala Ala Gln Ala Ala
245 250 255

ggc cgg cgc ctg agc gag ccc agc gtg cag gcc gac acg cgc ctc aat 816
Gly Arg Arg Leu Ser Glu Pro Ser Val Gln Ala Asp Thr Arg Leu Asn
260 265 270

gcc ttg atc gtg cag gat att ccc gaa cgg atg cca atc tac cgt gcc 864
Ala Leu Ile Val Gln Asp Ile Pro Glu Arg Met Pro Ile Tyr Arg Ala
275 280 285

ctg atc gag cag ttg gat gtg ccc agc acc ctg atc gaa ata gag gcc 912
Leu Ile Glu Gln Leu Asp Val Pro Ser Thr Leu Ile Glu Ile Glu Ala
290 295 300

atg atc gtg gac gtc aat acc gat ctg gtc aac gag ctg ggt gtc acc 960
Met Ile Val Asp Val Asn Thr Asp Leu Val Asn Glu Leu Gly Val Thr
305 310 315 320

tgg ggg gcg cag atc gga acc acc agc ctg ggc tat ggc gat ctg ggg 1008
Trp Gly Ala Gln Ile Gly Thr Thr Ser Leu Gly Tyr Gly Asp Leu Gly
325 330 335

ctg cgt ccc ggc aac ggc ctg ccc gtg gac ggc gcg gcg gcc gac ctg 1056
Leu Arg Pro Gly Asn Gly Leu Pro Val Asp Gly Ala Ala Ala Asp Leu
340 345 350

gcg ccc gga acc ttg ggg atc agt gtc agt acc cgg ctg gcg gcg cgc 1104
Ala Pro Gly Thr Leu Gly Ile Ser Val Ser Thr Arg Leu Ala Ala Arg
355 360 365

ttg cgt gcg ttg gag tcg gac ggg cag gcc aat atc ctg tct cag ccg 1152
Leu Arg Ala Leu Glu Ser Asp Gly Gln Ala Asn Ile Leu Ser Gln Pro
370 375 380

tcc atc ctg acc gcc gac aac ctc ggc gcc atg ata gac ctg tcg gat 1200
Ser Ile Leu Thr Ala Asp Asn Leu Gly Ala Met Ile Asp Leu Ser Asp
385 390 395 400

acc ttc tac att cgc acc ctg ggc gag cgc gta gcg aca gtc acg cct 1248
Thr Phe Tyr Ile Arg Thr Leu Gly Glu Arg Val Ala Thr Val Thr Pro
405 410 415

gtc acg gtg ggt acg tcg ttg cgt gtg acg ccg cgc tat atc gcc gcc 1296
Val Thr Val Gly Thr Ser Leu Arg Val Thr Pro Arg Tyr Ile Ala Ala
420 425 430

aag gga gga cgc cag gtg gaa ttg gcg atc gat atc gag gac gga cgg 1344
Lys Gly Gly Arg Gln Val Glu Leu Ala Ile Asp Ile Glu Asp Gly Arg
435 440 445

gtc ttg cag gag tat ccc atc gat ggt ctg ccc cgg gtt cgg aaa agc 1392
Val Leu Gln Glu Tyr Pro Ile Asp Gly Leu Pro Arg Val Arg Lys Ser
450 455 460

agc atc agc acg ctg gcg gtg gtg ggg gac gag cag acg ctg ctg atc 1440
Ser Ile Ser Thr Leu Ala Val Val Gly Asp Glu Gln Thr Leu Leu Ile
465 470 475 480

ggc ggc tac aac aat cgc cgt gac gaa gag cag gtc gag aaa gtg ccg 1488
Gly Gly Tyr Asn Asn Arg Arg Asp Glu Glu Gln Val Glu Lys Val Pro
485 490 495

ctg ctg gga gat atc ccc ggc ctg ggg ttc ttg ttc tcg agc aag tcc 1536
Leu Leu Gly Asp Ile Pro Gly Leu Gly Phe Leu Phe Ser Ser Lys Ser
500 505 510

cgg gcg gta cag cgc cgc gag cgg ctg ttc ctg atc cgg ccg cgt gtc 1584
Arg Ala Val Gln Arg Arg Glu Arg Leu Phe Leu Ile Arg Pro Arg Val
515 520 525

gtg gct atc gag ggc aag ccg gtc ttc agc ccc gtt gcg ggc acg tcg 1632
Val Ala Ile Glu Gly Lys Pro Val Phe Ser Pro Val Ala Gly Thr Ser
530 535 540

cag gtg ttc atg agc acg ggt tgg ggc ggg cat ggc agc agc ctg agc 1680
Gln Val Phe Met Ser Thr Gly Trp Gly Gly His Gly Ser Ser Leu Ser
545 550 555 560

att gca ccc ggc gag ggc ggg cat aca caa gtg cgt cat gat gcc cgg 1728
Ile Ala Pro Gly Glu Gly Gly His Thr Gln Val Arg His Asp Ala Arg
565 570 575

gcg ggc agg ccg gtc cgg ctg gtg ccg gat tca ttg cat gtg gag tat 1776
Ala Gly Arg Pro Val Arg Leu Val Pro Asp Ser Leu His Val Glu Tyr
580 585 590

ggc gag gcg ggg gag gcg tcg ccc tga 1803
Gly Glu Ala Gly Glu Ala Ser Pro
595 600



<210> 6
<211> 600
<212> PRT
<213> Bordetella pertussis

<400> 6
Met Ala Ile Gly Arg Leu Gly Tyr Leu Val Arg Gly Ala Trp Ala Gly
1 5 10 15
Gly Val Met Leu Leu Ala Ala Gly Ser Ala Trp Ala Ala Pro Asn Trp
20 25 30
Pro Leu Ala Pro Tyr Ser Tyr Tyr Ala Gln Gln Gln Ser Leu Ser Asp
35 40 45
Val Leu Arg Glu Phe Ala Ala Gly Phe Ser Leu Ala Leu Gln Gln Gly
50 55 60
Lys Gly Val Gln Gly Val Val Asn Gly Arg Phe Asn Ala Arg Thr Pro
65 70 75 80
Thr Glu Phe Ile Glu Arg Leu Ser Gly Ile Tyr Gly Phe Asn Trp Phe
85 90 95
Val His Ala Gly Thr Leu Tyr Val Ser Arg Thr Ser Asp Val Val Thr
100 105 110
Arg Ala Val Asp Ala Ala Gly Ala Ser Pro Ser Ala Leu Arg Gln Ala
115 120 125
Leu Leu Gln Leu Gly Ile Leu Asp Glu Arg Phe Gly Trp Gly Glu Leu
130 135 140
Pro Ala Gln Gly Val Ala Met Val Ser Gly Pro Pro Ala Tyr Val Ala
145 150 155 160
Leu Val Glu Gln Ala Val Ala Ala Leu Pro Lys Gly Ala Gly Asn Gln
165 170 175
Gln Val Ala Val Phe Arg Leu Lys His Ala Ser Val Ser Asp Arg Val
180 185 190
Ile Arg Tyr Arg Asp Gln Gln Val Val Thr Pro Gly Met Ala Thr Met
195 200 205
Leu Arg Gln Leu Ile Leu Gly Ala Gly Pro Gly Asn Asp Ala Ala Leu
210 215 220
Ala Ala Val Ala Ala Pro Leu Arg Glu Asn Pro Pro Val Phe Gly Asp
225 230 235 240
Ala Ala Ala Asp Gly Asn Ala Pro Leu Ala Gly Ala Ala Gln Ala Ala
245 250 255
Gly Arg Arg Leu Ser Glu Pro Ser Val Gln Ala Asp Thr Arg Leu Asn
260 265 270
Ala Leu Ile Val Gln Asp Ile Pro Glu Arg Met Pro Ile Tyr Arg Ala
275 280 285
Leu Ile Glu Gln Leu Asp Val Pro Ser Thr Leu Ile Glu Ile Glu Ala
290 295 300
Met Ile Val Asp Val Asn Thr Asp Leu Val Asn Glu Leu Gly Val Thr
305 310 315 320
Trp Gly Ala Gln Ile Gly Thr Thr Ser Leu Gly Tyr Gly Asp Leu Gly
325 330 335
Leu Arg Pro Gly Asn Gly Leu Pro Val Asp Gly Ala Ala Ala Asp Leu
340 345 350
Ala Pro Gly Thr Leu Gly Ile Ser Val Ser Thr Arg Leu Ala Ala Arg
355 360 365
Leu Arg Ala Leu Glu Ser Asp Gly Gln Ala Asn Ile Leu Ser Gln Pro
370 375 380
Ser Ile Leu Thr Ala Asp Asn Leu Gly Ala Met Ile Asp Leu Ser Asp
385 390 395 400
<210> 7
<211> 1281
<212> DNA
<213> Bordetella pertussis

<220>
<221> CDS
<222> (1)...(1281)

<400> 7
atg acg acg gcg ctg gaa ttc cgc gtg ctt tca ggc gca cag tgc atg 48
Met Thr Thr Ala Leu Glu Phe Arg Val Leu Ser Gly Ala Gln Cys Met
1 5 10 15

gcg cgc tgc ccg gcc gtg cat ggc gcg cgc gtg ggc gcc aat ccg cat 96
Ala Arg Cys Pro Ala Val His Gly Ala Arg Val Gly Ala Asn Pro His
20 25 30

tgc gat atc gtc ctg acc ggc gag gac atg ccc gaa gtg gcg gga tgg 144
Cys Asp Ile Val Leu Thr Gly Glu Asp Met Pro Glu Val Ala Gly Trp
35 40 45

ctg gag atc gac cag tcc ggc tgg cgg ttg gcc ggc gcc gtg acg ccc 192
Leu Glu Ile Asp Gln Ser Gly Trp Arg Leu Ala Gly Ala Val Thr Pro
50 55 60

ggc ctg gac gcc cag gcg ccg tgt ccg ccc gcg gcc ttc aac gaa ccc 240
Gly Leu Asp Ala Gln Ala Pro Cys Pro Pro Ala Ala Phe Asn Glu Pro
65 70 75 80

gta gag ctg gga gcc gcc tgg atc acc gtg gcc gcc cct tcc gcg ccg 288
Val Glu Leu Gly Ala Ala Trp Ile Thr Val Ala Ala Pro Ser Ala Pro
85 90 95

tgg ccc gcg ccg ccg gag ccg tgc ggc ccg gac ggc agc gac aca gcc 336
Trp Pro Ala Pro Pro Glu Pro Cys Gly Pro Asp Gly Ser Asp Thr Ala
100 105 110

ttg cac gac gtc cct ggc tcg aca agc ccg ccg tcc gtc gct gcc ctc 384
Leu His Asp Val Pro Gly Ser Thr Ser Pro Pro Ser Val Ala Ala Leu
115 120 125

atg ccg cgc cga cgt gca gga cgg ccc tgg ctg gcg ctg ggc gcg gcc 432
Met Pro Arg Arg Arg Ala Gly Arg Pro Trp Leu Ala Leu Gly Ala Ala
130 135 140

gcg gcc gtc ctg ctg gtc ggc ctg gcc acg gcg ctg gtt tcc gtg acc 480
Ala Ala Val Leu Leu Val Gly Leu Ala Thr Ala Leu Val Ser Val Thr
145 150 155 160

aca ccc gcc acg ccg ccg gcc gcg ccg ccc cca acg ccc acc gcg ccg 528
Thr Pro Ala Thr Pro Pro Ala Ala Pro Pro Pro Thr Pro Thr Ala Pro
165 170 175

ctg gtc cgc gcc gcg gcg ctc atc gac agc ctg ggc ctt acc gag caa 576
Leu Val Arg Ala Ala Ala Leu Ile Asp Ser Leu Gly Leu Thr Glu Gln
180 185 190

tta caa gcg gcc tac ggc cgt ggc ggc gtg ctc acc gtg acc gga tgg 624
Leu Gln Ala Ala Tyr Gly Arg Gly Gly Val Leu Thr Val Thr Gly Trp
195 200 205

gtg cac gac gag acg gaa ttc gct cgg gtc gcc agg gcg ttg gcg caa 672
Val His Asp Glu Thr Glu Phe Ala Arg Val Ala Arg Ala Leu Ala Gln
210 215 220

ctt gcg cca cgg cct gcc atg cag gta agc agg cag gac gag gcc agg 720
Leu Ala Pro Arg Pro Ala Met Gln Val Ser Arg Gln Asp Glu Ala Arg
225 230 235 240

gcc ctg gcc tgc gat gtc ctg gcg aca ttc ggg gtg cgc tac atg gcg 768
Ala Leu Ala Cys Asp Val Leu Ala Thr Phe Gly Val Arg Tyr Met Ala
245 250 255

cgc ccg tac ggc aat ggc cgc ctg gcg atc tcg ggc atc gcc agc gat 816
Arg Pro Tyr Gly Asn Gly Arg Leu Ala Ile Ser Gly Ile Ala Ser Asp
260 265 270

gcg cac gaa cgc gcc gcg gcg ctg cat gcg gtg cgc atg cgc ctg ccg 864
Ala His Glu Arg Ala Ala Ala Leu His Ala Val Arg Met Arg Leu Pro
275 280 285

ggc atg acg atc ctc ggt cgc gat gta cgc ctg gcc gac gag gtc tcg 912
Gly Met Thr Ile Leu Gly Arg Asp Val Arg Leu Ala Asp Glu Val Ser
290 295 300

gcc cag ttc gcg gcc cag ctg gcc gac gaa cgc ctc gac ggc gtc aag 960
Ala Gln Phe Ala Ala Gln Leu Ala Asp Glu Arg Leu Asp Gly Val Lys
305 310 315 320

ctc agc tgg cac gcc gac cgc ctg gac gca gat ccc ggc gga ttg gcg 1008
Leu Ser Trp His Ala Asp Arg Leu Asp Ala Asp Pro Gly Gly Leu Ala
325 330 335

gca ggc cgc atg gcg cgc ctg cgc gag ctg gtg gcc gcg ttc aac cag 1056
Ala Gly Arg Met Ala Arg Leu Arg Glu Leu Val Ala Ala Phe Asn Gln
340 345 350

cgc aac tac gac gtc gtc cgg ctg ccg gcc acc gcc gcg cgc gcg acg 1104
Arg Asn Tyr Asp Val Val Arg Leu Pro Ala Thr Ala Ala Arg Ala Thr
355 360 365

cgg gat cac gtg ccg ttc gag ata cgc agt gtc gtg agc ggc ccg caa 1152
Arg Asp His Val Pro Phe Glu Ile Arg Ser Val Val Ser Gly Pro Gln
370 375 380

ccg tac ctg atg ctg gcc gat ggc agc cgc ctc ctg gtg ggc gga ctg 1200
Pro Tyr Leu Met Leu Ala Asp Gly Ser Arg Leu Leu Val Gly Gly Leu
385 390 395 400

cgg gac cag tat cgc ctt acc gcc atc gaa tcc ggc cgc ctg gtc ttc 1248
Arg Asp Gln Tyr Arg Leu Thr Ala Ile Glu Ser Gly Arg Leu Val Phe
405 410 415

gat ggt ccc gaa ccg gtc atc gtg acg cga tga 1281
Asp Gly Pro Glu Pro Val Ile Val Thr Arg *
420 425



<210> 8
<211> 426
<212> PRT
<213> Bordetella pertussis

<210> 9
<211> 300
<212> DNA
<213> Bordetella pertussis

<220>
<221> CDS
<222> (1)...(300)

<400> 9
atg agt aca tct gtt cta gcc ctg acc gaa ctg gaa gtg cgc ctg gca 48
Met Ser Thr Ser Val Leu Ala Leu Thr Glu Leu Glu Val Arg Leu Ala
1 5 10 15

tcg ccg ggc ggt tcc gcc ttg cgc gac acc ttg ctg tcg cag ctt ggc 96
Ser Pro Gly Gly Ser Ala Leu Arg Asp Thr Leu Leu Ser Gln Leu Gly
20 25 30

gaa ctg gag aca cgt ttg cgc gtc cgc ctg cac gat ggc gtg ggg cgc 144
Glu Leu Glu Thr Arg Leu Arg Val Arg Leu His Asp Gly Val Gly Arg
35 40 45

gac acc tat ccc gta tgg cgc gac gcg ctg gcg gcc gcc acc gcg gcc 192
Asp Thr Tyr Pro Val Trp Arg Asp Ala Leu Ala Ala Ala Thr Ala Ala
50 55 60

cgg cag gta ttg ctg cag cgc ccg acc ggg ccg gac aac cct ccg gcg 240
Arg Gln Val Leu Leu Gln Arg Pro Thr Gly Pro Asp Asn Pro Pro Ala
65 70 75 80

tca gtc ttg acg cgc ctg agc aat gaa caa tgc gcc gaa gga gac aag 288
Ser Val Leu Thr Arg Leu Ser Asn Glu Gln Cys Ala Glu Gly Asp Lys
85 90 95

cat ggc cat taa 300
His Gly His *

<210> 10
<211> 99
<212> PRT
<213> Bordetella pertussis

<400> 10
Met Ser Thr Ser Val Leu Ala Leu Thr Glu Leu Glu Val Arg Leu Ala
1 5 10 15
Ser Pro Gly Gly Ser Ala Leu Arg Asp Thr Leu Leu Ser Gln Leu Gly
20 25 30
Glu Leu Glu Thr Arg Leu Arg Val Arg Leu His Asp Gly Val Gly Arg
35 40 45
Asp Thr Tyr Pro Val Trp Arg Asp Ala Leu Ala Ala Ala Thr Ala Ala
50 55 60
Arg Gln Val Leu Leu Gln Arg Pro Thr Gly Pro Asp Asn Pro Pro Ala
65 70 75 80
Ser Val Leu Thr Arg Leu Ser Asn Glu Gln Cys Ala Glu Gly Asp Lys
85 90 95
His Gly His
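
Each DNA record in this listing is a single CDS that ends in a stop codon, so the length of the paired protein record follows from the CDS length by simple arithmetic. The sketch below illustrates that check. The function name `expected_residues` and the `RECORDS` list are hypothetical, and the figures are copied from the `<222>` and `<211>` fields of the records for SEQ ID NOs 1 to 10.

```python
# (SEQ ID NO of the DNA record, CDS end from <222>, <211> of the paired protein record)
RECORDS = [
    (1, 2100, 699),
    (3, 486, 161),
    (5, 1803, 600),
    (7, 1281, 426),
    (9, 300, 99),
]


def expected_residues(cds_length):
    """Number of amino acids encoded by a CDS that includes its stop codon."""
    if cds_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    # One codon per residue, minus the terminal stop codon.
    return cds_length // 3 - 1


for seq_id, cds_length, protein_length in RECORDS:
    status = "consistent" if expected_residues(cds_length) == protein_length else "MISMATCH"
    print(f"SEQ ID NO {seq_id}: {cds_length} nt, {protein_length} aa, {status}")
```

For the five DNA and protein pairs listed, the stated lengths agree with this rule.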



<210> 11
<211> 267
<212> DNA
<213> Bordetella pertussis

<220>
<221> CDS
<222> (1)...(267)

<400> 11
atg gcc att aac ctg gga ggc gac gca ggc cga gtg acc atg cag agc 48
Met Ala Ile Asn Leu Gly Gly Asp Ala Gly Arg Val Thr Met Gln Ser
1 5 10 15

gtc aac cag gcg gtc aat acg cgg ctg aac gct cac gaa cgc gac ctg 96
Val Asn Gln Ala Val Asn Thr Arg Leu Asn Ala His Glu Arg Asp Leu
20 25 30

cgc agc cgc ctg gag gcg ctc agc gcg cgc gga gac ggc gcg gtc agc 144
Arg Ser Arg Leu Glu Ala Leu Ser Ala Arg Gly Asp Gly Ala Val Ser
35 40 45

acg tcc gac ctg ctg atc gtg caa cag gaa atg caa tcg tgg gtc gtg 192
Thr Ser Asp Leu Leu Ile Val Gln Gln Glu Met Gln Ser Trp Val Val
50 55 60

atg atc gat yta cag agc acg gtc gtn aag cag gtc gcg gat tcg ctc 240
Met Ile Asp Leu Gln Ser Thr Val Val Lys Gln Val Ala Asp Ser Leu
65 70 75 80

aag ggc gtc ata cag aag gcg agt tga 267
Lys Gly Val Ile Gln Lys Ala Ser *
85

<211> 88
<212> PRT
<213> Bordetella pertussis
<210> 14
<211> 108
<212> PRT
<213> Bordetella pertussis

<400> 14
Met Ala Asp Gln Ala Arg Phe Glu Leu Ala Leu Gly Glu Met Pro Gly
1 5 10 15
Ala Ser Ala Pro Asn Gly Ala Ile Ala Leu Ala Pro Val Ala Leu Asp
20 25 30
Glu Pro Leu Gly Arg Arg Ile Leu Gly Gln Leu Arg Gly Gly Leu Ala
35 40 45
Asp Val Ala Gly Lys Trp Arg Ala Val Gln Thr Gly Leu Ala Glu Val
50 55 60
Ser Gln Ala Pro Thr Val Val Gly Met Leu Asp Leu Gln Ala Arg Leu
65 70 75 80
Leu Gln Ala Ser Val Glu Tyr Glu Leu Val Gly Lys Ala Ile Gly Arg
85 90 95
Ala Thr Gln Asn Val Asp Thr Leu Ala Arg Met Ser
100 105

<210> 15 
<211> 825 
<212> DNA 

25 <213> Bordetella pertussis 

<220> 
<221> CDS 
<222> {!)... (825) 



15 



30 



35 



40 



45 



55 



60 



<400> 15 

atg aac gcc atc ggg gcg atc caa cgg tat cgg cgc ggc gcg gga tgg 48
Met Asn Ala Ile Gly Ala Ile Gln Arg Tyr Arg Arg Gly Ala Gly Trp
1 5 10 15

gcg gcc ctg gtg ctc gcc ctg gcg ctg ctg gcc ggc tgc ggt gcc cgc 96
Ala Ala Leu Val Leu Ala Leu Ala Leu Leu Ala Gly Cys Gly Ala Arg
20 25 30

gtc gag ctg ttg ggc gcg gcg ccc gag aac gaa gcc aac gaa gta ttg 144
Val Glu Leu Leu Gly Ala Ala Pro Glu Asn Glu Ala Asn Glu Val Leu
35 40 45

gcg gcg ctg ctc gag gca ggc atc gct gcg cag aag cag tcc ggc aag 192
Ala Ala Leu Leu Glu Ala Gly Ile Ala Ala Gln Lys Gln Ser Gly Lys
50 55 60

gcc ggc tac gcg gtt tcg gtg ccg gcc gag gcg gtg gcc cgg tcg ctg 240
Ala Gly Tyr Ala Val Ser Val Pro Ala Glu Ala Val Ala Arg Ser Leu
65 70 75 80

gag atc ctg cgc gca agc ggc ctg ccc cgc gag cag ttc gac gga atg 288
Glu Ile Leu Arg Ala Ser Gly Leu Pro Arg Glu Gln Phe Asp Gly Met
85 90 95

gga cgc ata ttc cgc aag gaa ggc ctg gtt tca tct ccg ctc gaa gag 336
Gly Arg Ile Phe Arg Lys Glu Gly Leu Val Ser Ser Pro Leu Glu Glu
100 105 110

cgc gcc cgc tac att tat gcg ctg tct cag gaa ttg gcc gac acc ctg 384 
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Arg Ala Arg Tyr He Tyr Ala Leu Ser Gin Glu Leu Ala Asp Thr Leu 

115 120 125 

tcg cag atc gac ggc gtg ctc agc gcc cgc gtg cac gtg gtg ctt ccc 432
Ser Gln Ile Asp Gly Val Leu Ser Ala Arg Val His Val Val Leu Pro
130 135 140

gaa cgc ggc gcg gtc ggc gag ccg gcc acc cct tcg acg gca ggg gtg 480
Glu Arg Gly Ala Val Gly Glu Pro Ala Thr Pro Ser Thr Ala Gly Val
145 150 155 160

ttt ctc aag tac cgc gac gga cag agc ctc gac gcg ctc gtg ccc gag 528
Phe Leu Lys Tyr Arg Asp Gly Gln Ser Leu Asp Ala Leu Val Pro Glu
165 170 175

atc cgc aag ctg gtc acg cat gcc atc ccg ggc ctg gcc gag gac cgt 576
Ile Arg Lys Leu Val Thr His Ala Ile Pro Gly Leu Ala Glu Asp Arg
180 185 190

gta tcg gtt gcc ctg gtg gtg gcc cag ccc gtt cag gcc gca ccc gcg 624
Val Ser Val Ala Leu Val Val Ala Gln Pro Val Gln Ala Ala Pro Ala
195 200 205

ccg gtc gcg tgg cgc cgc gtg ctt ggc gta cag gtc gcg gac gga tcg 672
Pro Val Ala Trp Arg Arg Val Leu Gly Val Gln Val Ala Asp Gly Ser
210 215 220

gtc ctg aga ttt tcg ctg ttg ctg ctg ttg ttg ccg gtg ctg tgc ctg 720
Val Leu Arg Phe Ser Leu Leu Leu Leu Leu Leu Pro Val Leu Cys Leu
225 230 235 240

ata gtg gcg ggg gcc gcg ctc tac gtc tgg cgc acg cgc tgg tcc cgc 768
Ile Val Ala Gly Ala Ala Leu Tyr Val Trp Arg Thr Arg Trp Ser Arg
245 250 255

ggc gaa ggg cgc ggc ggc gct ggc gcc ggc gcc acg gaa gga gcc ggg 816
Gly Glu Gly Arg Gly Gly Ala Gly Ala Gly Ala Thr Glu Gly Ala Gly
260 265 270

cat gac tga 825
His Asp *

<210> 16
<211> 274
<212> PRT
<213> Bordetella pertussis

<400> 16

Met Asn Ala Ile Gly Ala Ile Gln Arg Tyr Arg Arg Gly Ala Gly Trp
1 5 10 15

Ala Ala Leu Val Leu Ala Leu Ala Leu Leu Ala Gly Cys Gly Ala Arg
20 25 30

Val Glu Leu Leu Gly Ala Ala Pro Glu Asn Glu Ala Asn Glu Val Leu
35 40 45

Ala Ala Leu Leu Glu Ala Gly Ile Ala Ala Gln Lys Gln Ser Gly Lys
50 55 60

Ala Gly Tyr Ala Val Ser Val Pro Ala Glu Ala Val Ala Arg Ser Leu
65 70 75 80
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<210> 17 
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30 <212> DNA 

<213> Bordetella pertussis 
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Ala 
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Arg 
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His 


tgg 
Trp 

55 


teg 
Ser 
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ate 
He 
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Leu 
60 
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Leu 


ggc 
Gly 


55 
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Leu 
65 
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Gin 
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Pro 


gtc 
Val 
70 


ctg 
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Pro 
75 
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Pro 
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Val 
80 


60 
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gcg ctg ctg tgc gcg ccg cgc ctg cga cgc gcg ata gac ggc gcc gag 336 

Ala Leu Leu Cys Ala Pro Arg Leu Arg Arg Ala Ile Asp Gly Ala Glu
100 105 110

gtc cgt acc ttg cat gcc gcg ctc ggg cgc gat gtg atg aat ttc gcc 384
Val Arg Thr Leu His Ala Ala Leu Gly Arg Asp Val Met Asn Phe Ala
115 120 125

gtg tct tcc gcg gcg cgg gcc ctg cat gac ggg ctc gcc gcc agt tcg 432
Val Ser Ser Ala Ala Arg Ala Leu His Asp Gly Leu Ala Ala Ser Ser
130 135 140

gac tgg acc ctg gcc gcc acg gtc cag gcg gcg cag aaa ctg ggc tgg 480
Asp Trp Thr Leu Ala Ala Thr Val Gln Ala Ala Gln Lys Leu Gly Trp
145 150 155 160

gcc ctg ctg cgc gac gcc gtg cag ggc gcc gcc gac gag ata gcg ctg 528
Ala Leu Leu Arg Asp Ala Val Gln Gly Ala Ala Asp Glu Ile Ala Leu
165 170 175

cgt tgc gcg ctg aag ttg ccg cgc gac ctt gat ccc gcg ccc gtc ctg 576
Arg Cys Ala Leu Lys Leu Pro Arg Asp Leu Asp Pro Ala Pro Val Leu
180 185 190

ccg ccc gag gcg gcg ctt gcg ctg gtg ctg tcc atg ctc gaa atc ctg 624
Pro Pro Glu Ala Ala Leu Ala Leu Val Leu Ser Met Leu Glu Ile Leu
195 200 205

gat gca gaa tgg ctt tcc tcg ttc ccc gcc caa gcc tga 663
Asp Ala Glu Trp Leu Ser Ser Phe Pro Ala Gln Ala *
210 215 220
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Ala Leu Leu Arg Asp Ala Val Gin Gly Ala Ala Asp Glu lie Ala Leu 

165 170 175 

Arg Cys Ala Leu Lys Leu Pro Arg Asp Leu Asp Pro Ala Pro Val Leu
180 185 190

Pro Pro Glu Ala Ala Leu Ala Leu Val Leu Ser Met Leu Glu Ile Leu
195 200 205

Asp Ala Glu Trp Leu Ser Ser Phe Pro Ala Gln Ala
210 215 220

<210> 19
<211> 639
<212> DNA
<213> Bordetella pertussis

<220>
<221> CDS
<222> (1)...(639)

<400> 19

atg gct ttc ctc gtt ccc cgc cca agc ctg atc cag gcg gta cgg ccc 48
Met Ala Phe Leu Val Pro Arg Pro Ser Leu Ile Gln Ala Val Arg Pro
1 5 10 15

ggc cgt gcg gat ccc gcg acc gac gtc ttg cgc gct gaa gac tac gcc 96
Gly Arg Ala Asp Pro Ala Thr Asp Val Leu Arg Ala Glu Asp Tyr Ala
20 25 30

gag ctg ctc agc gcc gcg cag atc gtt gcc cag gca cac cgg cgg gcc 144
Glu Leu Leu Ser Ala Ala Gln Ile Val Ala Gln Ala His Arg Arg Ala
35 40 45

ggc gaa atc gtg gcc gag gcg cga gag gag ttc gag cgc gag cgc agg 192
Gly Glu Ile Val Ala Glu Ala Arg Glu Glu Phe Glu Arg Glu Arg Arg
50 55 60

cga ggc tat gag gag ggg cgc cgc gaa gcg ctt acg gat cag gcg gag 240
Arg Gly Tyr Glu Glu Gly Arg Arg Glu Ala Leu Thr Asp Gln Ala Glu
65 70 75 80

aag atg ata gaa acc gta agc cgc acg atc gac tac ttc gcg ggt atc 288
Lys Met Ile Glu Thr Val Ser Arg Thr Ile Asp Tyr Phe Ala Gly Ile
85 90 95

gag aac gag atg atc gaa ctg gtc atg agt gcg gtc cgc aag atc gtc 336
Glu Asn Glu Met Ile Glu Leu Val Met Ser Ala Val Arg Lys Ile Val
100 105 110

gac ggt tac gac gac cgc gag cgc acc gtg atc gcc gtg cgc aac gca 384
Asp Gly Tyr Asp Asp Arg Glu Arg Thr Val Ile Ala Val Arg Asn Ala
115 120 125

ttg gcg gtc gtg cgc aat cag cgc cag atg acc ttg cgc ctg cac cca 432
Leu Ala Val Val Arg Asn Gln Arg Gln Met Thr Leu Arg Leu His Pro
130 135 140

gac gag gtg gat gtg ctc cgg gaa ggc atg aac cag ctt ctg gcg gcc 480
Asp Glu Val Asp Val Leu Arg Glu Gly Met Asn Gln Leu Leu Ala Ala
145 150 155 160

tat ccg ggc gtg ggc tac ctg gac ctg ctg ccc gac gcc agg ctg gcg 528
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Tyr Pro Gly Val Gly Tyr Leu Asp Leu Leu Pro Asp Ala Arg Leu Ala 
165 170 175 

ccg gga gcc tgc ata ctg gag agc gag ata ggc atg gtc gag gcc agc 576
Pro Gly Ala Cys Ile Leu Glu Ser Glu Ile Gly Met Val Glu Ala Ser
180 185 190

ctc gag gac cag ctg tgc gcc ttg cgg gcg gcc ttc gaa cgt aca ttc 624
Leu Glu Asp Gln Leu Cys Ala Leu Arg Ala Ala Phe Glu Arg Thr Phe
195 200 205

ggc cgg cgc gga tag 639
Gly Arg Arg Gly *
210

<210> 20
<211> 212
<212> PRT
<213> Bordetella pertussis
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<400> 21 

atg cgt cag tac cac tac atc acg gag atg atg cgg gtg gcc ctg cag 48
Met Arg Gln Tyr His Tyr Ile Thr Glu Met Met Arg Val Ala Leu Gln
1 5 10 15

gat ctg tcc acg ctg cgg ata aag ggc cgg gtg gtg caa gtg gtg gga 96
Asp Leu Ser Thr Leu Arg Ile Lys Gly Arg Val Val Gln Val Val Gly
20 25 30

acg atc atc aag gcc gtc gtt ccg atg gtc aag atc ggc gaa gtg tgc 144
Thr Ile Ile Lys Ala Val Val Pro Met Val Lys Ile Gly Glu Val Cys
35 40 45

ctg ctg cgc aat ccc ggc gag gac ttc gag atg cac ggc gaa gtg gtg 192
Leu Leu Arg Asn Pro Gly Glu Asp Phe Glu Met His Gly Glu Val Val
50 55 60

ggc ttt gtc cgc gac gcc gcc ttg ctc acg cct atc ggc gac atg tac 240
Gly Phe Val Arg Asp Ala Ala Leu Leu Thr Pro Ile Gly Asp Met Tyr
65 70 75 80

ggg att tcc tcg gcg acc gag gtg ata ccg acc gga cgc acg cat atg 288
Gly Ile Ser Ser Ala Thr Glu Val Ile Pro Thr Gly Arg Thr His Met
85 90 95

gtc ccc gtc ggt ccg ggc ttg ctg gga cgc gtg ctg gac ggg ctg gga 336
Val Pro Val Gly Pro Gly Leu Leu Gly Arg Val Leu Asp Gly Leu Gly
100 105 110

cgt ccg ctg gac gcc gcc gag tca ggg ccg ctg cat gcc cac aag ttc 384
Arg Pro Leu Asp Ala Ala Glu Ser Gly Pro Leu His Ala His Lys Phe
115 120 125

tat ccg gtc ttc gcc gat gcg cca gac ccg ctg acg cgt cgc atc atc 432
Tyr Pro Val Phe Ala Asp Ala Pro Asp Pro Leu Thr Arg Arg Ile Ile
130 135 140

cat gct ccg ctg gag ctg ggg gtg cgc gta ctg gac ggt ttg ctt aca 480
His Ala Pro Leu Glu Leu Gly Val Arg Val Leu Asp Gly Leu Leu Thr
145 150 155 160

tgc ggg gaa ggc cag cgt ctg gga att ttc gca gcc gcc ggc ggc ggc 528
Cys Gly Glu Gly Gln Arg Leu Gly Ile Phe Ala Ala Ala Gly Gly Gly
165 170 175

aag tcg acc ctg ctg ggc atg ctg gtc aag ggc gcc gcg gtc gac gtg 576
Lys Ser Thr Leu Leu Gly Met Leu Val Lys Gly Ala Ala Val Asp Val
180 185 190

acg gtg gtg gcg ctg atc ggc gag cgt ggg cgg gaa gtt cgc gag ttc 624
Thr Val Val Ala Leu Ile Gly Glu Arg Gly Arg Glu Val Arg Glu Phe
195 200 205

ctt gag cac gaa ctc ggt ccg gag ggc aga cgc aag agc gtg atc gtc 672
Leu Glu His Glu Leu Gly Pro Glu Gly Arg Arg Lys Ser Val Ile Val
210 215 220

tgc gcg acc agc gac aag tcc tcg atg gag cgt gcc aag gcg gcg tac 720
Cys Ala Thr Ser Asp Lys Ser Ser Met Glu Arg Ala Lys Ala Ala Tyr
225 230 235 240
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30 



gtc gca acc gcc atc gcc gaa tac ttc cgc gat caa ggg cag cgt gta 768
Val Ala Thr Ala Ile Ala Glu Tyr Phe Arg Asp Gln Gly Gln Arg Val
245 250 255

ctt ttt ctg atg gac tcg gtc acc cgc ttt gcg cga gcc cag cgt gaa 816
Leu Phe Leu Met Asp Ser Val Thr Arg Phe Ala Arg Ala Gln Arg Glu
260 265 270

atc ggc ttg gcg gca ggc gag ccg ccg acg cgg cgc ggc tat cca ccg 864
Ile Gly Leu Ala Ala Gly Glu Pro Pro Thr Arg Arg Gly Tyr Pro Pro
275 280 285

tcg gtg ttc gcc acc ttg ccc aaa ctg atg gag cgc gcc ggc atg aac 912
Ser Val Phe Ala Thr Leu Pro Lys Leu Met Glu Arg Ala Gly Met Asn
290 295 300

cag acg ggt tcg atc acg gcg ctg tat acg gtg ctg gtc gag ggg gac 960
Gln Thr Gly Ser Ile Thr Ala Leu Tyr Thr Val Leu Val Glu Gly Asp
305 310 315 320

gac atg aac gaa ccg gtg gcc gac gag acg cgt tcg ata ctg gac ggc 1008
Asp Met Asn Glu Pro Val Ala Asp Glu Thr Arg Ser Ile Leu Asp Gly
325 330 335

cac atc gtg ctc tcg cgc aag ctg gga gcg gcg aat cac tat cct gcc 1056
His Ile Val Leu Ser Arg Lys Leu Gly Ala Ala Asn His Tyr Pro Ala
340 345 350

gtc gac gtg ctg gcc tcg gcc agc cgg gtc atg aat gcc gtg gtg tcg 1104
Val Asp Val Leu Ala Ser Ala Ser Arg Val Met Asn Ala Val Val Ser
355 360 365

ccg cgt cac aag tac ctg gcc gga cgt atg cgc gaa ctg atg gcc aag 1152
Pro Arg His Lys Tyr Leu Ala Gly Arg Met Arg Glu Leu Met Ala Lys
370 375 380

tac cag gat gtc gag ctg ttg gtg aaa atc ggc gag tac aag cag ggc 1200
Tyr Gln Asp Val Glu Leu Leu Val Lys Ile Gly Glu Tyr Lys Gln Gly
385 390 395 400

gcc gat gcg tcg acc gat gag gcg ata cag aag atc gga cag atc aat 1248
Ala Asp Ala Ser Thr Asp Glu Ala Ile Gln Lys Ile Gly Gln Ile Asn
405 410 415

gcg ttt ctc aga caa cta acc gac gaa cgc gaa gca ttc gag gat acc 1296
Ala Phe Leu Arg Gln Leu Thr Asp Glu Arg Glu Ala Phe Glu Asp Thr
420 425 430

gta ctg cgc atg gct gaa atc atc gga ccc gaa tcc taa 1335
Val Leu Arg Met Ala Glu Ile Ile Gly Pro Glu Ser *
435 440



<210> 22
<211> 444
<212> PRT
<213> Bordetella pertussis

<400> 22
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Met Arg Gin Tyr His Tyr He Thr Glu Met Met Arg Val Ala Leu Gin 

1 5 10 15 

Asp Leu Ser Thr Leu Arg Ile Lys Gly Arg Val Val Gln Val Val Gly
20 25 30

Thr Ile Ile Lys Ala Val Val Pro Met Val Lys Ile Gly Glu Val Cys
35 40 45

Leu Leu Arg Asn Pro Gly Glu Asp Phe Glu Met His Gly Glu Val Val
50 55 60

Gly Phe Val Arg Asp Ala Ala Leu Leu Thr Pro Ile Gly Asp Met Tyr
65 70 75 80

Gly Ile Ser Ser Ala Thr Glu Val Ile Pro Thr Gly Arg Thr His Met
85 90 95

Val Pro Val Gly Pro Gly Leu Leu Gly Arg Val Leu Asp Gly Leu Gly
100 105 110

Arg Pro Leu Asp Ala Ala Glu Ser Gly Pro Leu His Ala His Lys Phe
115 120 125

Tyr Pro Val Phe Ala Asp Ala Pro Asp Pro Leu Thr Arg Arg Ile Ile
130 135 140

His Ala Pro Leu Glu Leu Gly Val Arg Val Leu Asp Gly Leu Leu Thr
145 150 155 160

Cys Gly Glu Gly Gln Arg Leu Gly Ile Phe Ala Ala Ala Gly Gly Gly
165 170 175

Lys Ser Thr Leu Leu Gly Met Leu Val Lys Gly Ala Ala Val Asp Val
180 185 190

Thr Val Val Ala Leu Ile Gly Glu Arg Gly Arg Glu Val Arg Glu Phe
195 200 205

Leu Glu His Glu Leu Gly Pro Glu Gly Arg Arg Lys Ser Val Ile Val
210 215 220

Cys Ala Thr Ser Asp Lys Ser Ser Met Glu Arg Ala Lys Ala Ala Tyr
225 230 235 240

Val Ala Thr Ala Ile Ala Glu Tyr Phe Arg Asp Gln Gly Gln Arg Val
245 250 255

Leu Phe Leu Met Asp Ser Val Thr Arg Phe Ala Arg Ala Gln Arg Glu
260 265 270

Ile Gly Leu Ala Ala Gly Glu Pro Pro Thr Arg Arg Gly Tyr Pro Pro
275 280 285

Ser Val Phe Ala Thr Leu Pro Lys Leu Met Glu Arg Ala Gly Met Asn
290 295 300

Gln Thr Gly Ser Ile Thr Ala Leu Tyr Thr Val Leu Val Glu Gly Asp
305 310 315 320

Asp Met Asn Glu Pro Val Ala Asp Glu Thr Arg Ser Ile Leu Asp Gly
325 330 335

His Ile Val Leu Ser Arg Lys Leu Gly Ala Ala Asn His Tyr Pro Ala
340 345 350

Val Asp Val Leu Ala Ser Ala Ser Arg Val Met Asn Ala Val Val Ser
355 360 365

Pro Arg His Lys Tyr Leu Ala Gly Arg Met Arg Glu Leu Met Ala Lys
370 375 380

Tyr Gln Asp Val Glu Leu Leu Val Lys Ile Gly Glu Tyr Lys Gln Gly
385 390 395 400

Ala Asp Ala Ser Thr Asp Glu Ala Ile Gln Lys Ile Gly Gln Ile Asn
405 410 415

Ala Phe Leu Arg Gln Leu Thr Asp Glu Arg Glu Ala Phe Glu Asp Thr
420 425 430

Val Leu Arg Met Ala Glu Ile Ile Gly Pro Glu Ser
435 440
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<213> Bordetella pertussis 

<220>
<221> CDS
<222> (1)...(510)

<400> 23

atg gac ctg gaa agc ctg ctt gcc atc aag cat ttt cgc gcc gac caa 48
Met Asp Leu Glu Ser Leu Leu Ala Ile Lys His Phe Arg Ala Asp Gln
1 5 10 15

gcc cag ctt gcg ctg aaa cgc caa cag cag gcc tgc gcg gtt gct gcc 96
Ala Gln Leu Ala Leu Lys Arg Gln Gln Gln Ala Cys Ala Val Ala Ala
20 25 30

gcg gcg cag cgt cag gcg caa ggc cgc ctc gac gat tgt cgc ctg tgg 144
Ala Ala Gln Arg Gln Ala Gln Gly Arg Leu Asp Asp Cys Arg Leu Trp
35 40 45

gcc gga cag ctc gaa aac cgt cta tat gcc gag ctg tgc cgg cgc atc 192
Ala Gly Gln Leu Glu Asn Arg Leu Tyr Ala Glu Leu Cys Arg Arg Ile
50 55 60

gtc aag aca cgc gac atc gac gag gtg ctg caa cga gtg ggc cac gcc 240
Val Lys Thr Arg Asp Ile Asp Glu Val Leu Gln Arg Val Gly His Ala
65 70 75 80

cgc gac cgc cag gcc agc ctg gcg ctg cag ctc gac gac gcc gtg cgc 288
Arg Asp Arg Gln Ala Ser Leu Ala Leu Gln Leu Asp Asp Ala Val Arg
85 90 95

cgt cac gaa cat gaa atc cag ctg ctc gcg cag cag cgc gag cag cac 336
Arg His Glu His Glu Ile Gln Leu Leu Ala Gln Gln Arg Glu Gln His
100 105 110

cgg gag tgc ttc cag gcg cag caa cgg atc gcc gag ttg gtg cgc ctg 384
Arg Glu Cys Phe Gln Ala Gln Gln Arg Ile Ala Glu Leu Val Arg Leu
115 120 125

cag cag gtc gag gcg gcg gcc ttg cgc gag agc cag gaa gat cgc gaa 432
Gln Gln Val Glu Ala Ala Ala Leu Arg Glu Ser Gln Glu Asp Arg Glu
130 135 140

att cag gaa gcc atc gaa ttg tcg gcg cgt ggg cgc gac gat gca tcg 480
Ile Gln Glu Ala Ile Glu Leu Ser Ala Arg Gly Arg Asp Asp Ala Ser
145 150 155 160

cga gcc ggc gac ggc ctg gcg cgg cta tga 510
Arg Ala Gly Asp Gly Leu Ala Arg Leu *
165

<210> 24
<211> 169
<212> PRT
<213> Bordetella pertussis

<400> 24

Met Asp Leu Glu Ser Leu Leu Ala Ile Lys His Phe Arg Ala Asp Gln
1 5 10 15
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Pro Asp Thr Thr Lea Ser Val Arg Glu Asp Gly Gly Trp lie Val Val 

115 120 125 

gct ttc gca tgc cga caa cgg gac gct tgc gag cgc ctg cac gcg tgc 432
Ala Phe Ala Cys Arg Gln Arg Asp Ala Cys Glu Arg Leu His Ala Cys
130 135 140

gcc gac cgg ttg gcc atg gag ctc gcg ctg gag ctg gcg cgc gac gtc 480
Ala Asp Arg Leu Ala Met Glu Leu Ala Leu Glu Leu Ala Arg Asp Val
145 150 155 160
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gcg cag egg ccg tgg cga tga 549 
Ala Gln Arg Pro Trp Arg
180 

<210> 26 
<211> 182 
<212> PRT 

<213> Bordetella pertussis 





<400> 26

Met Asn Gln Pro Asp Gly Leu Gly Ser Pro Met Ala Gly Gly Gly Gln
1 5 10 15

Arg Met Gly Val Ala Arg Thr Pro Tyr Ala Arg Gln Pro Asp Arg Asp
20 25 30

Ala Gln Arg Ala Phe Glu Arg Glu Met Glu Gln Glu Lys Ala Lys Glu
35 40 45

Glu Leu Pro Gly Pro Gln Arg Leu Ala Pro Gly Pro Ala Cys Val Gly
50 55 60

Trp Leu Ala Ser Met Glu Pro Ala Ala Gly Arg Pro Pro Ala Ser Leu
65 70 75 80

Ala Gln Ala Leu Ala Ser Val Ala Ala Gly Leu Ala Val Gly Asp Val
85 90 95

Leu Glu Gly Tyr Arg Glu Ala Arg Ile Val Val Asp Asp Thr Leu Leu
100 105 110

Pro Asp Thr Thr Leu Ser Val Arg Glu Asp Gly Gly Trp Ile Val Val
115 120 125

Ala Phe Ala Cys Arg Gln Arg Asp Ala Cys Glu Arg Leu His Ala Cys
130 135 140

Ala Asp Arg Leu Ala Met Glu Leu Ala Leu Glu Leu Ala Arg Asp Val
145 150 155 160

Glu Val Ala Val Ala Cys Asp Gly Glu Pro His Glu Arg Val Ala Arg
165 170 175

Ala Gln Arg Pro Trp Arg
180

<210> 27
<211> 1080
<212> DNA
<213> Bordetella pertussis

<220>
<221> CDS
<222> (1)...(1080)






WO 00/37493 



PCT/EP99/10297 



<400> 27 

atg aat cga gtg gcc ggc ggg gcg gcg gcg cag gcc gct ggc atg gtg 48
Met Asn Arg Val Ala Gly Gly Ala Ala Ala Gln Ala Ala Gly Met Val
1 5 10 15

gat ctc gcg gtt ccg cgg ttg agc gcc ggc gag gcc cat gcc ctg tcg 96
Asp Leu Ala Val Pro Arg Leu Ser Ala Gly Glu Ala His Ala Leu Ser
20 25 30

agg att gca tgc cat ggc gcg cga ttc gac gtt cgg ctt ggc gag ccg 144
Arg Ile Ala Cys His Gly Ala Arg Phe Asp Val Arg Leu Gly Glu Pro
35 40 45

gcc gtg cgc tgg cac tgc gcc ctg acg cct tgc gtg cac ggc gac ctt 192
Ala Val Arg Trp His Cys Ala Leu Thr Pro Cys Val His Gly Asp Leu
50 55 60

gcc gat ggc gag atg gaa agc ctg caa ctg caa tgg gcc ggg acg tac 240
Ala Asp Gly Glu Met Glu Ser Leu Gln Leu Gln Trp Ala Gly Thr Tyr
65 70 75 80

atc ggc ctg acg gtt ccg cgc gcg gcc gcg gcg gga tgg ctg gcg gcg 288
Ile Gly Leu Thr Val Pro Arg Ala Ala Ala Ala Gly Trp Leu Ala Ala
85 90 95
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cgc ctg ccc egg ttt tec ggc gtg gag ttg ccg gaa ccc att gcg gcg 336 
Arg Leu Pro Arg Phe Ser Gly Val Glu Leu Pro Glu Pro Ile Ala Ala
100 105 110

gcg gcc ctg gag gca atg ctg gag gag gtc tgt cga ggc gtg gcc gga 384
Ala Ala Leu Glu Ala Met Leu Glu Glu Val Cys Arg Gly Val Ala Gly
115 120 125

ctc gac cag caa ggc ccg gtc cgc gtg gcg cgg caa ggc ggg acg cca 432
Leu Asp Gln Gln Gly Pro Val Arg Val Ala Arg Gln Gly Gly Thr Pro
130 135 140

ccg gtc cag ccg cat cgc tgg acc ctg acg gta cgg gcg cct gac ggt 480
Pro Val Gln Pro His Arg Trp Thr Leu Thr Val Arg Ala Pro Asp Gly
145 150 155 160

ggc gtc tgg cgc gcg gta ctg gcg tgc gac gca tgg gcc ttg caa gcg 528
Gly Val Trp Arg Ala Val Leu Ala Cys Asp Ala Trp Ala Leu Gln Ala
165 170 175

gtc gcg gcg gcg ctg gat tcc gtt gcg cct gcc gat ggt cgg gtc aat 576
Val Ala Ala Ala Leu Asp Ser Val Ala Pro Ala Asp Gly Arg Val Asn
180 185 190

ccg gag cgc gtg ccg gtc agg ttg cgt gcc gat gtc ggc gcg gcg tcc 624
Pro Glu Arg Val Pro Val Arg Leu Arg Ala Asp Val Gly Ala Ala Ser
195 200 205

gtg acc gca ggc cag ctg cgg acg ctg cga gcg ggc gac gtc gtg ttg 672
Val Thr Ala Gly Gln Leu Arg Thr Leu Arg Ala Gly Asp Val Val Leu
210 215 220

ctc gcg cag tac cgg gtg agc gat gcc gca gaa cta tgg ttg tcg gcc 720
Leu Ala Gln Tyr Arg Val Ser Asp Ala Ala Glu Leu Trp Leu Ser Ala
225 230 235 240






gga ccc agc gcg atc cgg gta cgg gcc gag cat gcg tct ttt cgt gta 768

Gly Pro Ser Ala Ile Arg Val Arg Ala Glu His Ala Ser Phe Arg Val
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act caa ggt tgg act ccc arc atg acg gaa ccc gcg aca cct gac cct 816 

Thr Gln Gly Trp Thr Pro Ile Met Thr Glu Pro Ala Thr Pro Asp Pro
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ggc gaa acc ccg gca cag gec gac gcg acg etc gat ace gat cag ata 864 

Gly Glu Thr Pro Ala Gln Ala Asp Ala Thr Leu Asp Thr Asp Gln Ile
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ccc gtg cgc ctg acg ttc gac ctg ggc gag cgc gag ttc acg ctt gcg 912 

Pro Val Arg Leu Thr Phe Asp Leu Gly Glu Arg Glu Phe Thr Leu Ala 
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cag ctg cgc age ctg cat ccg ggc tgc acg ttc gac etc gag egg ccc 960 

Gln Leu Arg Ser Leu His Pro Gly Cys Thr Phe Asp Leu Glu Arg Pro
305 310 315 320

atc gcc gac ggg ccg gtc atg gtg cgg gcc aat ggc ctg ttg ctg ggc 1008
Ile Ala Asp Gly Pro Val Met Val Arg Ala Asn Gly Leu Leu Leu Gly
325 330 335

agc ggc cgg ctg gtc gac atc gac ggc cgc atc ggc gtg gta ttg cag 1056
Ser Gly Arg Leu Val Asp Ile Asp Gly Arg Ile Gly Val Val Leu Gln
340 345 350

tcg gtc agg cct gga ctc gca tga 1080
Ser Val Arg Pro Gly Leu Ala *
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gcc gtg ctt ggc gtg gcc gag cga ggc gtg ggg ccg ctg egg gec ttc 

Ala Val Leu Gly Val Ala Glu Arg Gly Val Gly Pro Leu Arg Ala Phe 

100 105 110 

atg ttg cgc aac agc cag ccg gcc cag cgt gat ttc ttc ctg cgc aca 384
Met Leu Arg Asn Ser Gln Pro Ala Gln Arg Asp Phe Phe Leu Arg Thr
115 120 125
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10 gcg cgt cat etc tgg ggc gag gag gca teg egg gac ctg teg gaa gac 432 

Ala Arg His Leu Trp Gly Glu Glu Ala Ser Arg Asp Leu Ser Glu Asp 

130 135 140 

aac ctg ctg gta ttg acg ccc gca ttt ctg gtt tcg gag ctg acc gcc 480
Asn Leu Leu Val Leu Thr Pro Ala Phe Leu Val Ser Glu Leu Thr Ala
145 150 155 160

gca ttc cag ctt ggc ttt ctg ctg tac ctg ccg ttc atc atc atc gac 528
Ala Phe Gln Leu Gly Phe Leu Leu Tyr Leu Pro Phe Ile Ile Ile Asp
165 170 175

ctc atc gta tcg aac att ctt ctt gcc atg gga atg atg atg gtt tct 576
Leu Ile Val Ser Asn Ile Leu Leu Ala Met Gly Met Met Met Val Ser
180 185 190

ccc gtg acg atc tcc atg ccg ttg aag ctg ttc ctg ttc gtc atg gtg 624
Pro Val Thr Ile Ser Met Pro Leu Lys Leu Phe Leu Phe Val Met Val
195 200 205

gac ggc tgg acg cgc ctg atc cag ggc ctg gtg ctt tcc tat cgg tga 672
Asp Gly Trp Thr Arg Leu Ile Gln Gly Leu Val Leu Ser Tyr Arg *
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Phe Asp Ala Phe His Arg lie His 
85 

<210> 33
<211> 801
<212> DNA
<213> Bordetella pertussis

<220>
<221> CDS
<222> (1)...(801)

<400> 33

atg cac acg gag ttc aat ttc gtc gag gcg aag gtt ttc ctg gga acg 48
Met His Thr Glu Phe Asn Phe Val Glu Ala Lys Val Phe Leu Gly Thr
1 5 10 15

ctg gcc atg acg caa ccg cgg ata ctc acg gcc atg ctc ttt ctg ccg 96
Leu Ala Met Thr Gln Pro Arg Ile Leu Thr Ala Met Leu Phe Leu Pro
20 25 30

atg ttc aac cgt cag ttt ctg cct ggt ccg ctg cgt tac gcc gtc ggc 144
Met Phe Asn Arg Gln Phe Leu Pro Gly Pro Leu Arg Tyr Ala Val Gly
35 40 45

gcc tgt ctc ggg ctg atc gtg gtt ccc cag ctg gcg ccg cag tat gcc 192
Ala Cys Leu Gly Leu Ile Val Val Pro Gln Leu Ala Pro Gln Tyr Ala
50 55 60

gcg ctg gat atc gac tgg ccc cgg ctg ctg gcg ctg ctg gcc aag gag 240
Ala Leu Asp Ile Asp Trp Pro Arg Leu Leu Ala Leu Leu Ala Lys Glu
65 70 75 80

gcg atg gtg ggc atg ttc ctg ggt tgg ctg gct gcc ttg cca ttc tgg 288
Ala Met Val Gly Met Phe Leu Gly Trp Leu Ala Ala Leu Pro Phe Trp
85 90 95

atc ttc gag gcc atc ggc ttc gtc ata gac aac caa cgg ggc gcc agc 336
Ile Phe Glu Ala Ile Gly Phe Val Ile Asp Asn Gln Arg Gly Ala Ser
100 105 110

ctg ggc gct atc ctc aac ccc gcc acg ggc aac gat tcg tcg ccc atg 384
Leu Gly Ala Ile Leu Asn Pro Ala Thr Gly Asn Asp Ser Ser Pro Met
115 120 125

ggc att ctc ttc aat ctg gga ttc atg gtg ttc ttc ctg acg gcg ggc 432
Gly Ile Leu Phe Asn Leu Gly Phe Met Val Phe Phe Leu Thr Ala Gly
130 135 140

gga ttc ggg ttg ttc gcc acg atg ctg tat gac agc ttc ggg ttg tgg 480
Gly Phe Gly Leu Phe Ala Thr Met Leu Tyr Asp Ser Phe Gly Leu Trp
145 150 155 160

aac atc tgg gcg tgg tgg ccg tcc atg ccc gca cag ggc gcc gtg cgg 528
Asn Ile Trp Ala Trp Trp Pro Ser Met Pro Ala Gln Gly Ala Val Arg
165 170 175

atg ctg gac cag ttc agt ggc ttt gcc gcg cgt gtc ctg ctg ctg gcc 576
Met Leu Asp Gln Phe Ser Gly Phe Ala Ala Arg Val Leu Leu Leu Ala
180 185 190
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teg ccg gec ate gtg gec atg ttc ctg gec gag ctg ggc ctg gec ctg 624 
Ser Pro Ala Ile Val Ala Met Phe Leu Ala Glu Leu Gly Leu Ala Leu
195 200 205

atc agc cgc ttc gcg cct caa ctg cag gtg ttc ttc ctg gct ctg ccg 672
Ile Ser Arg Phe Ala Pro Gln Leu Gln Val Phe Phe Leu Ala Leu Pro
210 215 220

gta aag agc gcg ctg gtg ctg ttc gtg ctg gtg ctg tac atg gca acg 720
Val Lys Ser Ala Leu Val Leu Phe Val Leu Val Leu Tyr Met Ala Thr
225 230 235 240

ttg ttc cag tat gca ggc gaa atc ctg ggt tct gtg ggc cgg atc gtg 768
Leu Phe Gln Tyr Ala Gly Glu Ile Leu Gly Ser Val Gly Arg Ile Val
245 250 255

ccg ttc ctg cat tca gcg tgg ccc ggc cca tga 801
Pro Phe Leu His Ser Ala Trp Pro Gly Pro *
260 265



<210> 34
<211> 266
<212> PRT
<213> Bordetella pertussis

<400> 34

Met His Thr Glu Phe Asn Phe Val Glu Ala Lys Val Phe Leu Gly Thr
1 5 10 15

Leu Ala Met Thr Gln Pro Arg Ile Leu Thr Ala Met Leu Phe Leu Pro
20 25 30

Met Phe Asn Arg Gln Phe Leu Pro Gly Pro Leu Arg Tyr Ala Val Gly
35 40 45

Ala Cys Leu Gly Leu Ile Val Val Pro Gln Leu Ala Pro Gln Tyr Ala
50 55 60

Ala Leu Asp Ile Asp Trp Pro Arg Leu Leu Ala Leu Leu Ala Lys Glu
65 70 75 80

Ala Met Val Gly Met Phe Leu Gly Trp Leu Ala Ala Leu Pro Phe Trp
85 90 95

Ile Phe Glu Ala Ile Gly Phe Val Ile Asp Asn Gln Arg Gly Ala Ser
100 105 110

Leu Gly Ala Ile Leu Asn Pro Ala Thr Gly Asn Asp Ser Ser Pro Met
115 120 125

Gly Ile Leu Phe Asn Leu Gly Phe Met Val Phe Phe Leu Thr Ala Gly
130 135 140

Gly Phe Gly Leu Phe Ala Thr Met Leu Tyr Asp Ser Phe Gly Leu Trp
145 150 155 160

Asn Ile Trp Ala Trp Trp Pro Ser Met Pro Ala Gln Gly Ala Val Arg
165 170 175

Met Leu Asp Gln Phe Ser Gly Phe Ala Ala Arg Val Leu Leu Leu Ala
180 185 190

Ser Pro Ala Ile Val Ala Met Phe Leu Ala Glu Leu Gly Leu Ala Leu
195 200 205

Ile Ser Arg Phe Ala Pro Gln Leu Gln Val Phe Phe Leu Ala Leu Pro
210 215 220

Val Lys Ser Ala Leu Val Leu Phe Val Leu Val Leu Tyr Met Ala Thr
225 230 235 240

Leu Phe Gln Tyr Ala Gly Glu Ile Leu Gly Ser Val Gly Arg Ile Val
245 250 255
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48 
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144 



192 



240 



288 



336 



384 



432 



480 

145 150 155 160 

tcc ttg cag ccg ctg atg gcc gtt ccc cat agc ggg ctg gac ggg ttg 528
Ser Leu Gln Pro Leu Met Ala Val Pro His Ser Gly Leu Asp Gly Leu
165 170 175

cga acg ggc gta ggc cgc att ctg cag gtc atg gtc tgg aac atc gga 576
Arg Thr Gly Val Gly Arg Ile Leu Gln Val Met Val Trp Asn Ile Gly
180 185 190
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25 



ctg gcg tac ggg gcg att tcg ctg gcg gac ctg gcc tgg cag cgt tac 624
Leu Ala Tyr Gly Ala Ile Ser Leu Ala Asp Leu Ala Trp Gln Arg Tyr
195 200 205

cag tat cgc aaa ggc ttg cgg atg agc aag gac gaa gtg aag cag gag 672
Gln Tyr Arg Lys Gly Leu Arg Met Ser Lys Asp Glu Val Lys Gln Glu
210 215 220

tac aag gag atg gaa ggc gat ccc cat atc aag cag caa cgc aag cac 720
Tyr Lys Glu Met Glu Gly Asp Pro His Ile Lys Gln Gln Arg Lys His
225 230 235 240

ctg cac cag gag ctg atc atg cat ggc gcg gcg gcc cag gtt cgc cgg 768
Leu His Gln Glu Leu Ile Met His Gly Ala Ala Ala Gln Val Arg Arg
245 250 255

gcg acg gtg ctg gtg acc aat ccg aca cac ctg gcc gtg gcc ctg tac 816
Ala Thr Val Leu Val Thr Asn Pro Thr His Leu Ala Val Ala Leu Tyr
260 265 270

tac gcg gcg ggc gag acg ccc rtg ccg cgc gtg ctg gcc atg ggg cag 864
Tyr Ala Ala Gly Glu Thr Pro Leu Pro Arg Val Leu Ala Met Gly Gln
275 280 285

gga gcc gtg gcc gct ctc atg gtc gag gcc gcg cgc gat gcc ggc gtg 912
Gly Ala Val Ala Ala Leu Met Val Glu Ala Ala Arg Asp Ala Gly Val
290 295 300

ccg gtc atg cag aac gtc gcg ctg gcc cgc gcc ttg cac gac cag gcg 960
Pro Val Met Gln Asn Val Ala Leu Ala Arg Ala Leu His Asp Gln Ala
305 310 315 320

gag gtg gac caa tac att ccc ggc gag ttg gtg gag ccg gtg gcc gcg 1008
Glu Val Asp Gln Tyr Ile Pro Gly Glu Leu Val Glu Pro Val Ala Ala
325 330 335

gtg ttg cgg gcg gtg cgc cag gca ctc aag gag cag aca tga 1050
Val Leu Arg Ala Val Arg Gln Ala Leu Lys Glu Gln Thr *
340 345

<210> 36
<211> 349
<212> PRT
<213> Bordetella pertussis
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15 



Val Gly Met Phe Ala Glu Phe Leu Gln Val Gly Val Val Leu Ala Phe
100 105 110

Arg Lys Leu Lys Pro Ser Ala Glu Lys Leu Asn Pro Ala Gly Asn Leu
115 120 125

Lys Asn Ile Phe Ser Ala Arg Asn Leu Met Glu Phe Ile Lys Ser Val
130 135 140

Cys Lys Ile Leu Phe Leu Ala Val Leu Val Thr Leu Val Ile Arg Asp
145 150 155 160

Ser Leu Gln Pro Leu Met Ala Val Pro His Ser Gly Leu Asp Gly Leu
165 170 175

Arg Thr Gly Val Gly Arg Ile Leu Gln Val Met Val Trp Asn Ile Gly
180 185 190

Leu Ala Tyr Gly Ala Ile Ser Leu Ala Asp Leu Ala Trp Gln Arg Tyr
195 200 205

Gln Tyr Arg Lys Gly Leu Arg Met Ser Lys Asp Glu Val Lys Gln Glu
210 215 220

Tyr Lys Glu Met Glu Gly Asp Pro His Ile Lys Gln Gln Arg Lys His
225 230 235 240

Leu His Gln Glu Leu Ile Met His Gly Ala Ala Ala Gln Val Arg Arg
245 250 255

Ala Thr Val Leu Val Thr Asn Pro Thr His Leu Ala Val Ala Leu Tyr
260 265 270

Tyr Ala Ala Gly Glu Thr Pro Leu Pro Arg Val Leu Ala Met Gly Gln
275 280 285

Gly Ala Val Ala Ala Leu Met Val Glu Ala Ala Arg Asp Ala Gly Val
290 295 300

Pro Val Met Gln Asn Val Ala Leu Ala Arg Ala Leu His Asp Gln Ala
305 310 315 320

Glu Val Asp Gln Tyr Ile Pro Gly Glu Leu Val Glu Pro Val Ala Ala
325 330 335

Val Leu Arg Ala Val Arg Gln Ala Leu Lys Glu Gln Thr
340 345

<210> 37 
<211> 399
<212> DNA
<213> Bordetella pertussis
<220>
<221> CDS
<222> (1)...(399)

<400> 37

atg aca gca acc att cat ccc gat att gcc gat tat gcg cga cgc cat 48
Met Thr Ala Thr Ile His Pro Asp Ile Ala Asp Tyr Ala Arg Arg His
1 5 10 15



ggc ctc gaa ccc tcg gtc gac gcc gat ggc ggg ctt gcc gtc cgg atc 96
Gly Leu Glu Pro Ser Val Asp Ala Asp Gly Gly Leu Ala Val Arg Ile
20 25 30

gac gga cgg cat cgc gtc agg ttg atc ccc gcc gaa gac ggc atg ctg 144
Asp Gly Arg His Arg Val Arg Leu Ile Pro Ala Glu Asp Gly Met Leu
35 40 45

gtg ttg cgg gcg cgg ctg gcc gag ctg ccc gat ggg tgg cag gcg cgc 192
Val Leu Arg Ala Arg Leu Ala Glu Leu Pro Asp Gly Trp Gln Ala Arg
50 55 60

gcg gcg cag ttg cgc cgg gcg ggc ctg ctg gcc agc gcc atg gcc cct 240
Ala Ala Gln Leu Arg Arg Ala Gly Leu Leu Ala Ser Ala Met Ala Pro
65 70 75 80

gcg acc gat gcg tac tgc ggc ata gac cag ggc gaa acc gcg ttg tat 288
Ala Thr Asp Ala Tyr Cys Gly Ile Asp Gln Gly Glu Thr Ala Leu Tyr
85 90 95

ctg cac cag cgc gtc gca ccg gcc ggc agt gcg ctg gcg gtg gac gag 336
Leu His Gln Arg Val Ala Pro Ala Gly Ser Ala Leu Ala Val Asp Glu
100 105 110

gcg gtg ggc gag ttc gtc aat gcc ttg gcc act tgg aaa agg gcg atg 384 

Ala Val Gly Glu Phe Val Asn Ala Leu Ala Thr Trp Lys Arg Ala Met 
115 120 125 



gcg caa tgg caa tag 399
Ala Gln Trp Gln *
130 

<210> 38 
<211> 132 
<212> PRT 

<213> Bordetella pertussis



<400> 38 

Met Thr Ala Thr Ile His Pro Asp Ile Ala Asp Tyr Ala Arg Arg His
1 5 10 15

Gly Leu Glu Pro Ser Val Asp Ala Asp Gly Gly Leu Ala Val Arg Ile
20 25 30

Asp Gly Arg His Arg Val Arg Leu Ile Pro Ala Glu Asp Gly Met Leu
35 40 45

Val Leu Arg Ala Arg Leu Ala Glu Leu Pro Asp Gly Trp Gln Ala Arg
50 55 60

Ala Ala Gln Leu Arg Arg Ala Gly Leu Leu Ala Ser Ala Met Ala Pro
65 70 75 80

Ala Thr Asp Ala Tyr Cys Gly Ile Asp Gln Gly Glu Thr Ala Leu Tyr
85 90 95

Leu His Gln Arg Val Ala Pro Ala Gly Ser Ala Leu Ala Val Asp Glu
100 105 110

Ala Val Gly Glu Phe Val Asn Ala Leu Ala Thr Trp Lys Arg Ala Met
115 120 125

Ala Gln Trp Gln
130

<210> 39 
<211> 603 
<212> DNA 

<213> Bordetella pertussis 

<220> 

<221> CDS 

<222> (1)...(603)

<400> 39

atg gtt tct ccc ccg tca tcc ggt ctt ccc gct tct ctc gaa aaa ccg 48
Met Val Ser Pro Pro Ser Ser Gly Leu Pro Ala Ser Leu Glu Lys Pro
1 5 10 15

gac aac gca tat ccc gat atc gcc acc gag cga tcg gac cag cag ttg 96
Asp Asn Ala Tyr Pro Asp Ile Ala Thr Glu Arg Ser Asp Gln Gln Leu
20 25 30

ctg agc agc ctg gta gcc gaa cat gcc ggc cga tta cag aga ttc atc 144
Leu Ser Ser Leu Val Ala Glu His Ala Gly Arg Leu Gln Arg Phe Ile
35 40 45

gcc aag cac atc ggc cac agc agc gac gtc gag gac ctt gcg cag cag 192
Ala Lys His Ile Gly His Ser Ser Asp Val Glu Asp Leu Ala Gln Gln
50 55 60

gct ttc gcc gag gcg gcg cgc gcg tat caa tcg ttc cgt ggc gac tcc 240
Ala Phe Ala Glu Ala Ala Arg Ala Tyr Gln Ser Phe Arg Gly Asp Ser
65 70 75 80



cag ctt tcc acc tgg ctg tac ggc atc gcg ctc aat ctg gtc cgc aat 288
Gln Leu Ser Thr Trp Leu Tyr Gly Ile Ala Leu Asn Leu Val Arg Asn
85 90 95

cac ttg tcg cgt gcg cca gag cgc cgt tat gaa ttc acc tcc gac gcc 336
His Leu Ser Arg Ala Pro Glu Arg Arg Tyr Glu Phe Thr Ser Asp Ala
100 105 110

agc ctg ggt gtc atg cca tgc agt gcg ccc aac ccc gaa gcc gtg acc 384
Ser Leu Gly Val Met Pro Cys Ser Ala Pro Asn Pro Glu Ala Val Thr
115 120 125

gag cag cgt caa cgc atg cgc ttg cta cgc gaa gcg ctg gag cag ctc 432
Glu Gln Arg Gln Arg Met Arg Leu Leu Arg Glu Ala Leu Glu Gln Leu
130 135 140

ccc gaa agc atg cgc gac gtg atc ctc atg gtc ggc gtg gaa gaa ctc 480
Pro Glu Ser Met Arg Asp Val Ile Leu Met Val Gly Val Glu Glu Leu
145 150 155 160

tcc tat gaa gag gct gcc gca ctg ctt tcg gtt cct gta gga acc att 528
Ser Tyr Glu Glu Ala Ala Ala Leu Leu Ser Val Pro Val Gly Thr Ile
165 170 175

cgc agc cga ctt tcc cgc gcc cgc tgt gcc ttg cgc gaa gcg ctg cgc 576
Arg Ser Arg Leu Ser Arg Ala Arg Cys Ala Leu Arg Glu Ala Leu Arg
180 185 190

gaa cga ggc tac gac agc gtg ccg tag 603
Glu Arg Gly Tyr Asp Ser Val Pro *
195 200



Met Thr Arg Ile Asp Ala Ala Pro Asn Pro Phe His Ala Ala Met Gln

cag cgc atc gca ccg gcg ccc acc ggc ata tcg ctg gcg gac gcg gcc 144

cag gac gcg ctg gac aac gcg ata gca atg gag aac gca tga 1098

Asp Glu Ser Ala Arg Pro Arg Leu Thr Asp Gly Ala Leu Phe Gln Pro
65 70 75 80

gcg cag ttc gag cgc gcc ctg gcc cag gcg cgc gac gaa ctg tcc cgg 288
Ala Gln Phe Glu Arg Ala Leu Ala Gln Ala Arg Asp Glu Leu Ser Arg
85 90 95

gcc atg gaa ctg cat gcc ggc aac acc gcg cca gcc tta agc cgc gcc 336
Ala Met Glu Leu His Ala Gly Asn Thr Ala Pro Ala Leu Ser Arg Ala
100 105 110

ttg cac gta ctc aac gag gcc gga aag ctg cgc gac ctg gct gcc atg 384
Leu His Val Leu Asn Glu Ala Gly Lys Leu Arg Asp Leu Ala Ala Met
115 120 125

tat cgc agc gcg ctc tac cag gga tga 411
Tyr Arg Ser Ala Leu Tyr Gln Gly *
130 135
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<400> 49 

atg aat act gcc gat agg gcg ctg cat cag ttc ggc cag gat atc ggc 48
Met Asn Thr Ala Asp Arg Ala Leu His Gln Phe Gly Gln Asp Ile Gly
1 5 10 15

atc gag ggc ctg gca ttc ggg ccg tcc gga tcg gcg tcg ctg gcg ctg 96
Ile Glu Gly Leu Ala Phe Gly Pro Ser Gly Ser Ala Ser Leu Ala Leu
20 25 30

tcc aac ggg cgc cgc ctg ggc gtc gaa tgc gtc gcc ggc gcg gcc ctg 144
Ser Asn Gly Arg Arg Leu Gly Val Glu Cys Val Ala Gly Ala Ala Leu
35 40 45

gtc cac ctg gcc cag cgg gtc gag cgc gac gcc gcc tcc gtg ttg ctg 192
Val His Leu Ala Gln Arg Val Glu Arg Asp Ala Ala Ser Val Leu Leu
50 55 60

gcg gca tgg aaa cgg gcc cat ggg cag cgc gga agc gcc gca tcc atc 240
Ala Ala Trp Lys Arg Ala His Gly Gln Arg Gly Ser Ala Ala Ser Ile
65 70 75 80

cag acg tca ctc tgg tcg gag ggc agc gag gac tgg atc gtc gcg cag 288
Gln Thr Ser Leu Trp Ser Glu Gly Ser Glu Asp Trp Ile Val Ala Gln
85 90 95

aca cga ctg ccc gaa cgc tcg ctc gac gca gcg gcg ttg cgc ctg gcg 336
Thr Arg Leu Pro Glu Arg Ser Leu Asp Ala Ala Ala Leu Arg Leu Ala
100 105 110

gtg ctg ggc ctg acg aac tgg ctc gac cgc ctg gag gcg tga 378
Val Leu Gly Leu Thr Asn Trp Leu Asp Arg Leu Glu Ala *
115 120 125

Val Lys Gly Met Leu Asp Thr Ala Ser Asn Thr Gln Gln Met Asp Met
210 215 220

atc agg ctg cag gcc gcc agc aac aag cgc aac gag gct ttc gag gtc 720
Ile Arg Leu Gln Ala Ala Ser Asn Lys Arg Asn Glu Ala Phe Glu Val
225 230 235 240

atg acc aac acc gag aag cgg cgc agc gac ttg aac agc tcc atc acc 768
Met Thr Asn Thr Glu Lys Arg Arg Ser Asp Leu Asn Ser Ser Ile Thr
245 250 255

agc aac atg cgc taa 783
Ser Asn Met Arg *
260



<210> 52 
<211> 260 
<212> PRT 

<213> Bordetella pertussis

<400> 52 

Met Gly Ser Pro Arg Arg Arg Asn His Leu Pro Thr Gly Ala Val Ser 
1 5 10 15 

Val Ala Arg Ala Val Met Val Pro Gly Asn Gly Arg Asp Ile Gly Gln
20 25 30

Phe Ala Ala Trp Asn Leu Pro Arg Ala Gln Gly Tyr Ser Ala Cys Val
35 40 45

Phe Gln Leu Glu Gly Ala Leu Met Ser Ile Asp Leu Gly Val Ser Leu
50 55 60

Thr Ser Gln Ala Gly Gly Leu Gln Gly Ile Asp Leu Lys Ser Met Asp
65 70 75 80

Ile Gln Thr Leu Met Val Tyr Val Gln Gly Arg Arg Ala Glu Leu Leu
85 90 95

Thr Ala Gln Met Gln Thr Gln Ala Glu Val Val Gln Lys Ala Asn Glu
100 105 110

Arg Met Ala Gln Leu Asn Glu Val Leu Ser Ala Leu Ser Arg Ala Lys
115 120 125

Ala Glu Phe Pro Pro Asn Pro Lys Pro Gly Asp Thr Ile Pro Gly Trp
130 135 140

Asp Ser Gln Lys Ile Ser Arg Ile Glu Val Pro Leu Asn Asp Ala Leu
145 150 155 160

Arg Ala Ala Gly Leu Thr Gly Met Phe Glu Ala Arg Asp Gly Arg Val
165 170 175

Thr Gly Pro Asp Gly Arg Gly Thr Gln Val Val Asn Gly Thr Gly Val
180 185 190

Met Ala Gly Ser Thr Thr Tyr Lys Glu Leu Glu Ser Ala Tyr Thr Thr
195 200 205

Val Lys Gly Met Leu Asp Thr Ala Ser Asn Thr Gln Gln Met Asp Met
210 215 220

Ile Arg Leu Gln Ala Ala Ser Asn Lys Arg Asn Glu Ala Phe Glu Val
225 230 235 240

Met Thr Asn Thr Glu Lys Arg Arg Ser Asp Leu Asn Ser Ser Ile Thr
245 250 255

Ser Asn Met Arg

260 
His Val Gly Val Asp Pro Ala Arg Leu Arg Asn Leu Ala Val Glu Gln
35 40 45

gcc agg ata gag gcc gag gcc cag gcg gcg ttc cgt gat gac ctc gcg 192
Ala Arg Ile Glu Ala Glu Ala Gln Ala Ala Phe Arg Asp Asp Leu Ala
50 55 60

gac atc gag cgc gag gcg gcg cgc gtc aag gcg gcc tgc acc gat gcg 240
Asp Ile Glu Arg Glu Ala Ala Arg Val Lys Ala Ala Cys Thr Asp Ala
65 70 75 80

ccg cag gcc cgc agg gtg ctt cac aac cac gtc tga 276
Pro Gln Ala Arg Arg Val Leu His Asn His Val *
85 90
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<400> 55 

atg tct gtt tct ccg act tcg ccc ggc tct ttc ggg gcc ggc cct gtc 48
Met Ser Val Ser Pro Thr Ser Pro Gly Ser Phe Gly Ala Gly Pro Val
1 5 10 15

ttt gac tcc gaa ttg cag gcc ccg gcc ccg tcg gcg cag cgt cgc ggc 96
Phe Asp Ser Glu Leu Gln Ala Pro Ala Pro Ser Ala Gln Arg Arg Gly
20 25 30

ggt gcg gcg cct gtg ccg ccg ccc gtc gat cgg cgc ggc gtc gag ccg 144
Gly Ala Ala Pro Val Pro Pro Pro Val Asp Arg Arg Gly Val Glu Pro
35 40 45

gga gat ccc acg ctg ggc atg ctg ccc gcg cca gat ttg ctc gcg ggg 192
Gly Asp Pro Thr Leu Gly Met Leu Pro Ala Pro Asp Leu Leu Ala Gly
50 55 60

ggc gcc gtc agc cgc acc cgc gcg gcg ctc gac gat ctg gac gcc gca 240
Gly Ala Val Ser Arg Thr Arg Ala Ala Leu Asp Asp Leu Asp Ala Ala
65 70 75 80

cgg ctc ggt gaa gac atc tac gcc ttg atg gcg gtg ttg caa cag gcc 288
Arg Leu Gly Glu Asp Ile Tyr Ala Leu Met Ala Val Leu Gln Gln Ala
85 90 95

agt cag cag atg cgg gac gcc gcc cgt atc gct cgt gat gcc gag gct 336
Ser Gln Gln Met Arg Asp Ala Ala Arg Ile Ala Arg Asp Ala Glu Ala
100 105 110

acg cgg caa acg cag gct ctc ggc gat gcg gcc agc cag atg cgc cag 384
Thr Arg Gln Thr Gln Ala Leu Gly Asp Ala Ala Ser Gln Met Arg Gln
115 120 125

gcg gcg agc gag cgc atg gcc gga gcg atc gtg gcg ggc gcc atg cag 432
Ala Ala Ser Glu Arg Met Ala Gly Ala Ile Val Ala Gly Ala Met Gln
130 135 140

ata gcg ggt ggt ttc gtg cag ctg ggg gcg ggc ctg gca gcg ggt ttg 480
Ile Ala Gly Gly Phe Val Gln Leu Gly Ala Gly Leu Ala Ala Gly Leu
145 150 155 160

cag gcc atg ggt ggc gct gct gcg caa gcc aag ggc gcc gca ttc tcc 528
Gln Ala Met Gly Gly Ala Ala Ala Gln Ala Lys Gly Ala Ala Phe Ser

gag cag gcc tcg aca agc cgc aag gtg gcg gcc ggc ttg cac gat gcc 576
Glu Gln Ala Ser Thr Ser Arg Lys Val Ala Ala Gly Leu His Asp Ala

ccc gag ctg cag gca acg gtg cag gcc cgc gca acc cag ctc gaa gcg 624
Pro Glu Leu Gln Ala Thr Val Gln Ala Arg Ala Thr Gln Leu Glu Ala
195 200 205

caa gcg gcc tcg ttt ggt gcg gac gcg gct cgt tcg tcg gca aag tcg 672
Gln Ala Ala Ser Phe Gly Ala Asp Ala Ala Arg Ser Ser Ala Lys Ser

cag cgc gta tcg agc gtt gcc cag gcc ggc gcc gca gcg gcc ggc ggt 720
Gln Arg Val Ser Ser Val Ala Gln Ala Gly Ala Ala Ala Ala Gly Gly

atc ggc ggc ctg acc agc gcc gcc cag gaa cgc cgc gcc gcc gag cac 768
Ile Gly Gly Leu Thr Ser Ala Ala Gln Glu Arg Arg Ala Ala Glu His
245 250 255

gag gcc agg cgc gcg gag ctg gac gtc gaa gcg aag gtg cat gaa acg 816
Glu Ala Arg Arg Ala Glu Leu Asp Val Glu Ala Lys Val His Glu Thr

gcc tcg cgg cgg gcc gac gaa gcc atg cag cag atg ctc gac atc atc 864
Ala Ser Arg Arg Ala Asp Glu Ala Met Gln Gln Met Leu Asp Ile Ile

cgc ggc atc agg gaa aag ctg gcc ggg atg gag cag tcc cgc agc gag 912
Arg Gly Ile Arg Glu Lys Leu Ala Gly Met Glu Gln Ser Arg Ser Glu

acc gcc cgt agc gtg gcc cgc aat atc tga 942
Thr Ala Arg Ser Val Ala Arg Asn Ile *
305 310

Ala Ser Val Val Thr Gly Ala Ala Ala Thr Pro Met Leu Val Leu Ser

ggc atg gca ttg gtc agc gcc gtg aca tcg ctg gcc gac cag ata tcg 576
Gly Met Ala Leu Val Ser Ala Val Thr Ser Leu Ala Asp Gln Ile Ser

cga gag gcg gga ggg ccg cct atc agc ctg ggc ggg ttt ctc tcc ggg 624

gac caa att gcc aag atc gtc gcc ggc ctg gcc gtg ccc gcc gtc ttg 720
Asp Gln Ile Ala Lys Ile Val Ala Gly Leu Ala Val Pro Ala Val Leu

ctg atc gaa ccc cag atg ctg ggc gaa atg gcc gaa ggc gtg gcc agg 768
Leu Ile Glu Pro Gln Met Leu Gly Glu Met Ala Glu Gly Val Ala Arg
245 250 255

ctg gcg ggc gcc ggc gat gcc acc gcg gga tac ata gcc atg gcg atg 816
Leu Ala Gly Ala Gly Asp Ala Thr Ala Gly Tyr Ile Ala Met Ala Met
260 265 270

tcc atc gtg gcg gcg atc gcg gtc gcc gcg atc aat gcc gcc ggt acg 864
Ser Ile Val Ala Ala Ile Ala Val Ala Ala Ile Asn Ala Ala Gly Thr
275 280 285

gcc ggc gcg ggc agc gcc tcg gcg atc agg ggt gcc tgg gat cgg gcc 912
Ala Gly Ala Gly Ser Ala Ser Ala Ile Arg Gly Ala Trp Asp Arg Ala
290 295 300

Ala Ala Val Ala Thr Gln Val Leu Gln Gly Gly Thr Ala Val Ala Gln

ggc ggc gtc ggc gtg tcg atg gca gtc gat cgc aaa cag gcc gat ctc 1008
Gly Gly Val Gly Val Ser Met Ala Val Asp Arg Lys Gln Ala Asp Leu
325 330 335
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385 390 395 400 
Met Thr Val Met Ser Thr Thr Ile Ser Thr Ala Pro Ser Gly Ala Ala
1 5 10 15

Leu Ala Pro Ser Arg Ile Asp Met Arg Ala Pro Glu Pro Gly Ser Ala
20 25 30

Gly Glu Gly Ala Gly Ile Leu Ala Pro Val Thr Thr Leu Ala Leu Ala
35 40 45

Ala Gly Arg Pro Ala Leu Pro Ala Ser Pro Ser Leu Arg Thr Ala Pro

Val Leu Asp Pro Pro Val Arg Asp Leu Ser Pro Ala Asp Leu Ala Asp
65 70 75 80

Leu Leu Arg Val Leu Arg Ser Arg Ala Val Asp Gly Gln Leu Ala Thr
85 90 95

Gln Ala Gln Leu Asp Lys Leu Asp Ala Trp Phe Arg Lys Ala Glu Asp
115 120 125

Ala Glu Ser Lys Gly Trp Leu Ser Lys Val Phe Gly Trp Ile Gly Lys
130 135 140

Val Leu Ala Val Val Ala Ser Ala Leu Ala Val Gly Phe Ala Ala Val

Ala Ser Val Val Thr Gly Ala Ala Ala Thr Pro Met Leu Val Leu Ser
165 170 175

Gly Met Ala Leu Val Ser Ala Val Thr Ser Leu Ala Asp Gln Ile Ser

Arg Glu Ala Gly Gly Pro Pro Ile Ser Leu Gly Gly Phe Leu Ser Gly

Leu Ala Gly Arg Leu Leu Thr Ala Leu Gly Val Asp Gln Ser Gln Ala
210 215 220

Asp Gln Ile Ala Lys Ile Val Ala Gly Leu Ala Val Pro Ala Val Leu
225 230 235 240

Leu Ile Glu Pro Gln Met Leu Gly Glu Met Ala Glu Gly Val Ala Arg

Leu Ala Gly Ala Gly Asp Ala Thr Ala Gly Tyr Ile Ala Met Ala Met
260 265 270

Ser Ile Val Ala Ala Ile Ala Val Ala Ala Ile Asn Ala Ala Gly Thr

Ala Gly Ala Gly Ser Ala Ser Ala Ile Arg Gly Ala Trp Asp Arg Ala

Ala Ala Val Ala Thr Gln Val Leu Gln Gly Gly Thr Ala Val Ala Gln

Gly Gly Val Gly Val Ser Met Ala Val Asp Arg Lys Gln Ala Asp Leu

Leu Val Ala Asp Lys Ala Asp Leu Ala Ala Ser Leu Thr Lys Leu Arg
340 345 350

Ala Ala Met Glu Arg Glu Ala Asp Asp Ile Lys Lys Ile Leu Ala Gln
355 360 365

Phe Asp Ala Ala Tyr His Met Ile Ala Gln Met Ile Ser Asp Met Ala
370 375 380

Ser Thr His Ser Gln Val Ser Ala Asn Leu Gly Arg Arg Gln Ala Val
385 390 395 400 
tat tcc atg ttc ccg ctc cgt aga acg cga tac acc caa gga ttc gag 384
Tyr Ser Met Phe Pro Leu Arg Arg Thr Arg Tyr Thr Gln Gly Phe Glu
115 120 125

aca acg gcc cac cgc atg aac ttc cag atc cca ccc gct tta cct gct 432
Thr Thr Ala His Arg Met Asn Phe Gln Ile Pro Pro Ala Leu Pro Ala
130 135 140

ttg gag ctt gat gtc ttt gcg cgc gcc gcc agc caa gga gag acc cta 480
Leu Glu Leu Asp Val Phe Ala Arg Ala Ala Ser Gln Gly Glu Thr Leu
145 150 155 160

tat gtc acc aaa gca ggc gag cag ttc cag gtc atc gca tcc ggc acg 528
Tyr Val Thr Lys Ala Gly Glu Gln Phe Gln Val Ile Ala Ser Gly Thr
165 170 175

acg ccg tca ggg cgc aac gta tcc tgg gtc gcc acc gac gag gac acg 576
Thr Pro Ser Gly Arg Asn Val Ser Trp Val Ala Thr Asp Glu Asp Thr
180 185 190

ctt gtc atg ttt tcc agc gcg ctg gcg ctg gcc tac ggc acg gga atc 624
Leu Val Met Phe Ser Ser Ala Leu Ala Leu Ala Tyr Gly Thr Gly Ile
195 200 205

gcc cgc gcc gtc gcc aag gag ctc gat ctg cac gcg gtc ccg acg aca 672
Ala Arg Ala Val Ala Lys Glu Leu Asp Leu His Ala Val Pro Thr Thr
210 215 220

tcg ctg tcg gcg cgc gtc gtc acg cga gcg gtc gac atg gcg gaa acc 720
Ser Leu Ser Ala Arg Val Val Thr Arg Ala Val Asp Met Ala Glu Thr
225 230 235 240

tca cgc cac gcc ctg cag ggc gtg gat ttc ctt acc ttc ctg tcc tgg 768
Ser Arg His Ala Leu Gln Gly Val Asp Phe Leu Thr Phe Leu Ser Trp
245 250 255



tcg gcc cgc gcc gac gcc gcc ggc ttc cga cag gtc tgt cac gac acc 816
Ser Ala Arg Ala Asp Ala Ala Gly Phe Arg Gln Val Cys His Asp Thr
260 265 270

ggt gtc tct ccc gat cag ata tcc gga acg ttg cgc gcc acg atc gac 864
Gly Val Ser Pro Asp Gln Ile Ser Gly Thr Leu Arg Ala Thr Ile Asp
275 280 285

gaa agc atg cag cag cgc ttc gca tcc gcc gca caa tca ggt aag gcg 912
Glu Ser Met Gln Gln Arg Phe Ala Ser Ala Ala Gln Ser Gly Lys Ala
290 295 300

ccg gta tcc gcc cat acg gcg caa gaa tgg ttg cgc gag gtc ctt gcg 960
Pro Val Ser Ala His Thr Ala Gln Glu Trp Leu Arg Glu Val Leu Ala
305 310 315 320

cac cac ctg gtg tag 975 
His His Leu Val * 
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<213> Bordetella pertussis 
<400> 66 

Met Arg Phe Arg Ala Gly Tyr Ser Arg Tyr Gln Ala Arg Ser Gly His
1 5 10 15

Gly Asp Arg Pro Pro Pro Ala Gln Ala Arg Val Gln Thr Val Leu Leu
20 25 30

His Gly Leu Ser Ala Leu Thr Ala Gln Val Ala Gln Arg Phe Glu Met
35 40 45

Ala Arg His Arg Met Ala Gly Pro Gly Arg Thr Thr Gly His His His
50 55 60

Phe Gln Leu Glu Ala Gln Arg Met Ala Asp Thr Leu Arg Ser Val Gln
65 70 75 80

Gly Glu Pro Arg Trp Pro Asp Gly Ser Glu Ala Cys Met Pro Ser Gly
85 90 95

Leu Ser Cys Arg His Gly Thr Glu Glu Pro Lys Ala Ser His Ser Ala
100 105 110

Tyr Ser Met Phe Pro Leu Arg Arg Thr Arg Tyr Thr Gln Gly Phe Glu
115 120 125

Thr Thr Ala His Arg Met Asn Phe Gln Ile Pro Pro Ala Leu Pro Ala
130 135 140

Leu Glu Leu Asp Val Phe Ala Arg Ala Ala Ser Gln Gly Glu Thr Leu
145 150 155 160

Tyr Val Thr Lys Ala Gly Glu Gln Phe Gln Val Ile Ala Ser Gly Thr
165 170 175

Thr Pro Ser Gly Arg Asn Val Ser Trp Val Ala Thr Asp Glu Asp Thr
180 185 190

Leu Val Met Phe Ser Ser Ala Leu Ala Leu Ala Tyr Gly Thr Gly Ile
195 200 205

Ala Arg Ala Val Ala Lys Glu Leu Asp Leu His Ala Val Pro Thr Thr
210 215 220

Ser Leu Ser Ala Arg Val Val Thr Arg Ala Val Asp Met Ala Glu Thr
225 230 235 240

Ser Arg His Ala Leu Gln Gly Val Asp Phe Leu Thr Phe Leu Ser Trp
245 250 255

Ser Ala Arg Ala Asp Ala Ala Gly Phe Arg Gln Val Cys His Asp Thr
260 265 270

Gly Val Ser Pro Asp Gln Ile Ser Gly Thr Leu Arg Ala Thr Ile Asp
275 280 285

Glu Ser Met Gln Gln Arg Phe Ala Ser Ala Ala Gln Ser Gly Lys Ala
290 295 300

Pro Val Ser Ala His Thr Ala Gln Glu Trp Leu Arg Glu Val Leu Ala
305 310 315 320

His His Leu Val

<210> 67 
<211> 1146 
<212> DNA 

<213> Bordetella pertussis

<220> 
<221> CDS 
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<400> 67 

atg ctg atc aac gcg gcc gag cac ccc gcc gcc agc ctg gat gcc gac 48
Met Leu Ile Asn Ala Ala Glu His Pro Ala Ala Ser Leu Asp Ala Asp
1 5 10 15



62 



WO 00/37493 



PCT/EP99/10297 



tgg tac cgg cga gtg cgg gtg ccg cgg ccc atc tac gag gaa ctc gtc 96
Trp Tyr Arg Arg Val Arg Val Pro Arg Pro Ile Tyr Glu Glu Leu Val
20 25 30

ggc cag cga ggc tgg ctg cac cgg atc ggg ata gac gcc aag gca cag 144
Gly Gln Arg Gly Trp Leu His Arg Ile Gly Ile Asp Ala Lys Ala Gln
35 40 45

aac agc ccc tgc acg tcg gtt ccc gtg gcc atc gcc gcg cgc tgc ctg 192
Asn Ser Pro Cys Thr Ser Val Pro Val Ala Ile Ala Ala Arg Cys Leu
50 55 60

aac gtc gtg ctg gcg ctg gct ccc gcg cag atc gcc atg ttc gcc aac 240
Asn Val Val Leu Ala Leu Ala Pro Ala Gln Ile Ala Met Phe Ala Asn
65 70 75 80

agc ccg ctg gag gca ggg cgg gtg acc ggt ctc aag gaa aac cgc ctg 288
Ser Pro Leu Glu Ala Gly Arg Val Thr Gly Leu Lys Glu Asn Arg Leu
85 90 95

acc ctg tgg ccg cgc atg ttc cga ggc gcg cgc tac ctg ggc gac gac 336
Thr Leu Trp Pro Arg Met Phe Arg Gly Ala Arg Tyr Leu Gly Asp Asp
100 105 110

ctg ctg cat cgc ctg cct gca agg ccg ttt cgc gat ctc ggc gat tat 384
Leu Leu His Arg Leu Pro Ala Arg Pro Phe Arg Asp Leu Gly Asp Tyr
115 120 125

ttc cgc tgg atg ttc ggc gga ttg acc gcc agc cgg gcg cta ccg ccg 432
Phe Arg Trp Met Phe Gly Gly Leu Thr Ala Ser Arg Ala Leu Pro Pro
130 135 140

ggc gac gct tgc gac tac aag aac gcc gat gtg gcc tgc ctg gtg gga 480
Gly Asp Ala Cys Asp Tyr Lys Asn Ala Asp Val Ala Cys Leu Val Gly
145 150 155 160

gcc cct tcg ctg gca gag ttc ctg tat gcg ggc gcg tgg tcc gcg cga 528
Ala Pro Ser Leu Ala Glu Phe Leu Tyr Ala Gly Ala Trp Ser Ala Arg
165 170 175

aac ctg aat gat ggc ggt tcc gtg cgt ctg gcc gcg cgc agc gaa cat 576
Asn Leu Asn Asp Gly Gly Ser Val Arg Leu Ala Ala Arg Ser Glu His
180 185 190

ttc gtc tat tcg cag ttc gcg cag ttc ctg gac gcg cgt tgg cgc tac 624
Phe Val Tyr Ser Gln Phe Ala Gln Phe Leu Asp Ala Arg Trp Arg Tyr
195 200 205

agg atg ccg att gtc ccc gcc ttg ccg gcg ctg ttg cga gcc tgg gac 672
Arg Met Pro Ile Val Pro Ala Leu Pro Ala Leu Leu Arg Ala Trp Asp
210 215 220

agg cag ggc ggc ctg gaa gcg ctg ttc gag cag gcc ggc gcg caa ggc 720
Arg Gln Gly Gly Leu Glu Ala Leu Phe Glu Gln Ala Gly Ala Gln Gly
225 230 235 240

tac atc gag ggg cgc gcg ccg ggc gcg gta ttt gcc gat gcc gac ttg
Tyr Ile Glu Gly Arg Ala Pro Gly Ala Val Phe Ala Asp Ala Asp Leu
245 250 255

ctg agc tca gcc ggc gat gca etc gcg gcc agt gcg ccg atg gcg acg 816
Leu Ser Ser Ala Gly Asp Ala Val Ala Ala Ser Ala Pro Met Ala Ala
260 265 270

tcg gcg ctg caa ttg ggg ctg ttg cgc aat ctg cac gac gcc gag gcc 864
Ser Ala Leu Gln Leu Gly Leu Leu Arg Asn Leu His Asp Ala Glu Ala
275 280 285

ctg gtg agg cga tgg ggc tgg ctg cgc ttg cgt gcg ttg cgc gat cgg 912
Leu Val Arg Arg Trp Gly Trp Leu Arg Leu Arg Ala Leu Arg Asp Arg
290 295 300

gcc atc gct ttg gcg ttg gac gat gcg cag gtg cgc tgc ctt tgc caa 960
Ala Ile Ala Leu Ala Leu Asp Asp Ala Gln Val Arg Cys Leu Cys Gln
305 310 315

cag gtc gtg gcg gta gcc gaa ggc ggg ctg gcc ggc gac gag cag caa 1008
Gln Val Val Ala Val Ala Glu Gly Gly Leu Ala Gly Asp Glu Gln Gln
325 330 335

tgg ctc gat tat gtg cgt tac gtg gtg gaa acc ggc gag acc gcc gcg 1056
Trp Leu Asp Tyr Val Arg Tyr Val Val Glu Thr Gly Glu Thr Ala Ala
340 345 350

gac cgc atg ctg cgc ttg tgg cgc cag gcg cgc ggc acg cct gag atg 1104
Asp Arg Met Leu Arg Leu Trp Arg Gln Ala Arg Gly Thr Pro Glu Met
355 360 365

cgc cgc gca cag gcg tgc cgg cag cgc gcg gtg ctg tcc tag 1146
Arg Arg Ala Gln Ala Cys Arg Gln Arg Ala Val Leu Ser *
370 375 380
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85 90 95 

cga aag gcc tgg ctt tac ctg gcc acg cat gcc cgc agc aac tac atc 336
Arg Lys Ala Trp Leu Tyr Leu Ala Thr His Ala Arg Ser Asn Tyr Ile
100 105 110

gag ttc gtg ccc gat gcg tgg tgg cag ccc ggc aac ttc gac acc gcc 384
Glu Phe Val Pro Asp Ala Trp Trp Gln Pro Gly Asn Phe Asp Thr Ala
115 120 125

ttg cgg cct gcc gtg cgc gaa gcc gtt gcg gca cgc ctg cat ggc gcc 432
Leu Arg Pro Ala Val Arg Glu Ala Val Ala Ala Arg Leu His Gly Ala
130 135 140

aag gac atc gac ctg atc atc gcc atg ggt acc tgg gct gga cag gac 480
Lys Asp Ile Asp Leu Ile Ile Ala Met Gly Thr Trp Ala Gly Gln Asp
145 150 155 160

atg gtc gaa ctg ggc acg ccg gta ccc acc gtg gtc gtc tcg tcg acc 528
Met Val Glu Leu Gly Thr Pro Val Pro Thr Val Val Val Ser Ser Thr
165 170 175

gac ccg ata agc gcc cgg atc ata ccc agt gcg gcc gac agc ggc cag 576
Asp Pro Ile Ser Ala Arg Ile Ile Pro Ser Ala Ala Asp Ser Gly Gln
180 185 190

gac aac ctg cat gcc cgg gta cag ccc gac cac tac cag cgg cag atc 624
Asp Asn Leu His Ala Arg Val Gln Pro Asp His Tyr Gln Arg Gln Ile
195 200 205

cag ctg ctc cat gac atc gtg ccg ttc aag acg ctt gga ctg gtc tac 672
Gln Leu Leu His Asp Ile Val Pro Phe Lys Thr Leu Gly Leu Val Tyr
210 215 220

gaa gac acc gaa gca ggt cgc acc tac gca gcc atc gat aag gtc gcc 720
Glu Asp Thr Glu Ala Gly Arg Thr Tyr Ala Ala Ile Asp Lys Val Ala
225 230 235 240

gca cta atg ccg gca ttg gat ttc tcc gtc aag cgt tgc gac gca cgc 768
Ala Leu Met Pro Ala Leu Asp Phe Ser Val Lys Arg Cys Asp Ala Arg
245 250 255

gcg acc ggc atc ccc atc gcc acg gca acc cag aac gtt ctg gct tgc 816
Ala Thr Gly Ile Pro Ile Ala Thr Ala Thr Gln Asn Val Leu Ala Cys
260 265 270

tac cag aag ctg tcg agc gaa gtc gac gcc ttt tac gtc acc gag cac 864
Tyr Gln Lys Leu Ser Ser Glu Val Asp Ala Phe Tyr Val Thr Glu His
275 280 285

cgg ggc atc acc tcg acg tcc gtc aag cag ctc gcc gcg ctg ctg cgc 912
Arg Gly Ile Thr Ser Thr Ser Val Lys Gln Leu Ala Ala Leu Leu Arg
290 295 300

gcc gcc cgc gtg ccg agt ttc tcg atg caa ggc tcc gac gag gtc aag 960
Ala Ala Arg Val Pro Ser Phe Ser Met Gln Gly Ser Asp Glu Val Lys
305 310 315 320

gcc ggc ctg ttg atg agc ctg gcc aag gcg gac tac tcc agc gta ggc 1008
Ala Gly Leu Leu Met Ser Leu Ala Lys Ala Asp Tyr Ser Ser Val Gly

atg ttc cac gcc cag acc att gcc cgc att ttc aat ggg gaa aag ccg 1056
Met Phe His Ala Gln Thr Ile Ala Arg Ile Phe Asn Gly Glu Lys Pro
340 345 350

cgc agc atc agc cag gtc tgg aat gcc ccc gcc aag ata gcc atc aat 1104
Arg Ser Ile Ser Gln Val Trp Asn Ala Pro Ala Lys Ile Ala Ile Asn
355 360 365

ctg gaa acg gcg cgg cgc atc ggc ttc gac cca ccg gtg gat att ctg 1152
Leu Glu Thr Ala Arg Arg Ile Gly Phe Asp Pro Pro Val Asp Ile Leu
370 375 380

ctg gcg gcc gac gag gtg tac gaa gcg gag cac tga cag gcc tgg cca 1200
Leu Ala Ala Asp Glu Val Tyr Glu Ala Glu His * Gln Ala Trp Pro
385 390 395

acg aga cct ggc aag gaa tgt gcc gga tcc tag 1233
Thr Arg Pro Gly Lys Glu Cys Ala Gly Ser *
400 405



<210> 70 
<211> 409 
<212> PRT 

<213> Bordetella pertussis 
Ala Pro Leu Ala Leu Leu Leu Gly
10 15

Ser Ser Pro Pro Pro Val Ala Ala
25 30

Ser Pro Arg Leu Pro Pro Pro Ser
45

Ile Gly Tyr Val Gly Ser Gly Glu
60

Tyr Ala Ile Ala Arg Ala Leu Gln
75 80

Asp Met Pro Glu Ile Thr Asp Met
90 95

Thr His Ala Arg Ser Asn Tyr Ile
105 110

Gln Pro Gly Asn Phe Asp Thr Ala
125

Val Ala Ala Arg Leu His Gly Ala
140

Met Gly Thr Trp Ala Gly Gln Asp
155 160

Pro Thr Val Val Val Ser Ser Thr
170 175

Pro Ser Ala Ala Asp Ser Gly Gln
185 190

Pro Asp His Tyr Gln Arg Gln Ile
205

Phe Lys Thr Leu Gly Leu Val Tyr
220

Tyr Ala Ala Ile Asp Lys Val Ala
235 240

Ser Val Lys Arg Cys Asp Ala Arg

Ala Gly Leu Leu Met Ser Leu Ala Lys Ala Asp Tyr Ser Ser Val Gly
325 330 335

Met Phe His Ala Gln Thr Ile Ala Arg Ile Phe Asn Gly Glu Lys Pro
340 345 350

Arg Ser Ile Ser Gln Val Trp Asn Ala Pro Ala Lys Ile Ala Ile Asn
355 360 365

Leu Glu Thr Ala Arg Arg Ile Gly Phe Asp Pro Pro Val Asp Ile Leu
370 375 380

Leu Ala Ala Asp Glu Val Tyr Glu Ala Glu His Gln Ala Trp Pro Thr
385 390 395 400

Arg Pro Gly Lys Glu Cys Ala Gly Ser
405

<210> 71 
<211> 645 
<212> DNA
<213> Bordetella pertussis
<220>
<221> CDS
<222> (1)...(645)

<400> 71

atg gag cag ctg gat ctg ccg ctg gta gtg gtc ggg ctg tac ccg ggc 48
Met Glu Gln Leu Asp Leu Pro Leu Val Val Val Gly Leu Tyr Pro Gly
1 5 10 15

atg cag gtt gtc ctg gcc gct gtc ggc cgc act ggg tat gat ccg ggc 96
Met Gln Val Val Leu Ala Ala Val Gly Arg Thr Gly Tyr Asp Pro Gly
20 25 30

gct tat cgg gtc ggt cga cga gac gac cac ggt ggg tac cgg cgt gcc 144
Ala Tyr Arg Val Gly Arg Arg Asp Asp His Gly Gly Tyr Arg Arg Ala
35 40 45

cag ttc gac cat gtc ctg tcc agc cca ggt acc cat ggc gat gat cag 192
Gln Phe Asp His Val Leu Ser Ser Pro Gly Thr His Gly Asp Asp Gln
50 55 60

gt ° gat gtc ctt ggc gcc atg cag gcg tgc cgc aac ggc ttc gcg cac 240
Val Asp Val Leu Gly Ala Met Gln Ala Cys Arg Asn Gly Phe Ala His
65 70 75 80

ggc agg ccg caa ggc ggt gtc gaa gtt gcc ggg ctg cca cca cgc atc 288
Gly Arg Pro Gln Gly Gly Val Glu Val Ala Gly Leu Pro Pro Arg Ile
85 90 95

ggg cac gaa ctc gat gta gtt gct gcg ggc atg cgt ggc cag gta aag 336
Gly His Glu Leu Asp Val Val Ala Ala Gly Met Arg Gly Gln Val Lys
100 105 110

cca ggc ctt tcg cat atc ggt tat ctc ggg cat gtc gtc gat acg cag 384
Pro Gly Leu Ser His Ile Gly Tyr Leu Gly His Val Val Asp Thr Gln
115 120 125

cca tcc gag ttg ttg caa tgc gcg cgc gat cgc gta gag cgt gcg cgg 432
Pro Ser Glu Leu Leu Gln Cys Ala Arg Asp Arg Val Glu Arg Ala Arg
130 135 140

ata ctc ctc gta ctc gcc gct acc cac ata acc gat gcg cca ttt ccg 480
Ile Leu Leu Val Leu Ala Ala Thr His Ile Thr Asp Ala Pro Phe Pro
145 150 155 160

gcc gga tgt atg gga ggg agg cgg aag gcg ggg cga ggt aag ggc gac 528
Ala Gly Cys Met Gly Gly Arg Arg Lys Ala Gly Arg Gly Lys Gly Asp
165 170 175

gcc gga gct cag ggc cgc gac agg agg agg gct gga tgc cgc cgc gat 576
Ala Gly Ala Gln Gly Arg Asp Arg Arg Arg Ala Gly Cys Arg Arg Asp
180 185 190

ggg cca cgc gag gcc aag aag cag ggc gag cgg ggc gag tat tcc ggg 624
Gly Pro Arg Glu Ala Lys Lys Gln Gly Glu Arg Gly Glu Tyr Ser Gly
195 200 205

gcg aag ggt cat ggg cga tga 645
Ala Lys Gly His Gly Arg
210



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195 200 205 

Ala Lys Gly His Gly Arg 
210 

<210> 73 
<211> 1314 
<212> DNA 

<213> Bordetella pertussis 



ctg aac ctg gcc ccg atc gcc gcc aag ggc gcc atg ggt tcg tcc ggc 528
Leu Asn Leu Ala Pro Ile Ala Ala Lys Gly Ala Met Gly Ser Ser Gly
165 170 175

ttc gag gcc agc atg gcg ttg atg acc atc ctg tgc gtg ggc ggc atc 576
Phe Glu Ala Ser Met Ala Leu Met Thr Ile Leu Cys Val Gly Gly Ile
180 185 190

gcc gtc tac acg cgc ggc atg gtg cag cgg ctg ctg atc ctg gtc ggc 624
Ala Val Tyr Thr Arg Gly Met Val Gln Arg Leu Leu Ile Leu Val Gly
195 200 205

ctg gtg ctg gcc tgc gtc atc tac gcg gtc tgc gcc aac ggc ctg ggg 672
Leu Val Leu Ala Cys Val Ile Tyr Ala Val Cys Ala Asn Gly Leu Gly
210 215 220

ctg ggc gcg ccc atg gac ttc gcc aag gtg gcc gcc gcg ccg tgg ttc 720
Leu Gly Ala Pro Met Asp Phe Ala Lys Val Ala Ala Ala Pro Trp Phe
225 230 235 240

ggc ctg ccc agc ttc gcc gcg ccg gtg ttc gag ccg cag gcc atg ggc 768
Gly Leu Pro Ser Phe Ala Ala Pro Val Phe Glu Pro Gln Ala Met Gly
245 250 255

ctg atc gtg ccg gtg gcc atc atc ctg gtg gcc gag aac ctg ggc cac 816
Leu Ile Val Pro Val Ala Ile Ile Leu Val Ala Glu Asn Leu Gly His
260 265 270

gtg aag gcg gtc gcc gcc atg acc gga cag gac ctg gac cgc tac gtg 864
Val Lys Ala Val Ala Ala Met Thr Gly Gln Asp Leu Asp Arg Tyr Val
275 280 285

ggc cgc gcc ttc gtg ggc gac ggc gtg gcg acc atg gtt tcc ggc gcc 912
Gly Arg Ala Phe Val Gly Asp Gly Val Ala Thr Met Val Ser Gly Ala
290 295 300

gtc ggc ggc acc ggg gtg acc acc tac gcc gag aat atc ggc gtg atg 960
Val Gly Gly Thr Gly Val Thr Thr Tyr Ala Glu Asn Ile Gly Val Met
305 310 315 320

gcc gtg acg cgc atc tat tcc acg ctg gtg ttc gtg gtg gcg gcc gtg 1008
Ala Val Thr Arg Ile Tyr Ser Thr Leu Val Phe Val Val Ala Ala Val
325 330 335

atC gCg Ctg gtg Ctg ggg ttc tcg ccc aa 9 ttc 99 c 9 C< ? ct 9 atc ca 9 1056 

40 He Ala Leu Val Leu Gly Phe Ser Pro Lys Phe Gly Ala Leu lie Gin 

340 345 350 

acc atc ccc ggc ccc gtg ctg ggg ggc atg tcg gtc gtg gtg ttc ggc 1104 

Thr He Pro Gly Pro Val Leu Gly Gly Met Ser Val Val Val Phe Gly 

45 355 360 365 



ctg atc gcc atc gcc ggc gcg cgc atc tgg gtg gtc aac cag gtc gat 1152 

Leu lie Ala He Ala Gly Ala Arg He Trp Val Val Asn Gin Val Asp 
370 375 380 

ttc age gac aac cgc aat ctg atc gtg gcc gcc gtg acc ctg gtg ctg 1200 

Phe Ser Asp Asn Arg Asn Leu lie Val Ala Ala Val Thr Leu Val Leu 

385 390 395 400 

ggg gcg ggc gac ttc age gtc aag etg ggc gat ttc tcg atg aac ggc 1248 

Gly Ala Gly Asp Phe Ser Val Lys Leu Gly Asp Phe Ser Met Asn Gly 
405 410 415 

atc ggc acc gcc acg ttc ggc gcc atc atc ctg tac gcc ctg ctg ggc 1296 

He Gly Thr Ala Thr Phe Gly Ala He lie Leu Tyr Ala Leu Leu Gly 
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420 



425 



430 



ctg gcg cgt cgc cgc tga 
Leu Ala Arg Arg Arg * 
435 



1314 



<210> 
<211> 

10 <212> 
<213> 

<400> 
Met Ser Asn 

15 l 

Pro Gly Ala 

lie Ala Met 
35 

20 Leu Ala Pro 
50 

Ser Gly lie 
65 

Pro Ser Tyr 

25 

Val Thr Gly 

Leu Gly Ala 
115 

30 Val Val Trp 
130 

Ala Met Met 
145 

Leu Asn Leu 

35 

Phe Glu Ala 

Ala Val Tyr 
195 

40 Leu Val Leu 
210 
Leu Gly Ala 
225 

Gly Leu Pro 

45 

Leu lie Val 

Val Lys Ala 
275 

50 Gly Arg Ala 
290 
Val Gly Gly 
305 

Ala Val Thr 

55 

He Ala Leu 

Thr He Pro 
355 

60 Leu lie Ala 
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325 






Val 
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Gly 


Phe 


340 








Gly 


Pro 


Val 


Leu 


He 


Ala 


Gly 
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ssis 



Arg Trp Arg Leu Ala 
10 

Asp Glu Arg Leu Ser 
25 

Val Val Ala Met Phe 
40 

Phe Asp Pro Asn Val 
60 

Phe Phe Leu Phe Val 
75 

Phe Ala Phe He Gly 
90 

Gly Ala Asn Ala Asn 
105 

Gly Leu Val Tyr Ala 
120 

Arg Gly Asn Gly Ala 
140 

Thr Gly Ala Val Val 
155 

Ala Lys Gly Ala Met 
170 

Met Thr He Leu Cys 
185 

Val Gin Arg Leu Leu 
200 

Tyr Ala Val Cys Ala 
220 

Ala Lys Val Ala Ala 
235 

Pro Val Phe Glu Pro 
250 

He Leu Val Ala Glu 
265 

Thr Gly Gin Asp Leu 
280 

Gly Val Ala Thr Met 
300 

Thr Tyr Ala Glu Asn 
315 

Thr Leu Val Phe Val 
330 

Ser Pro Lys Phe Gly 
345 

Gly Gly Met Ser Val 
360 

Arg He Trp Val Val 



Asp Asp Thr Val 
15 

Trp Pro Lys Asn 

30 

Gly Ser Thr Val 
45 

Ala He Leu Met 

Gly Gly Arg Val 
80 

Gly Val Val Ala 
95 

He Gly Val Ala 
110 

Leu He Gly Leu 
125 

Arg Trp lie Glu 

Ala Val He Gly 
160 

Gly Ser Ser Gly 
175 

Val Gly Gly He 
190 

He Leu Val Gly 
205 

Asn Gly Leu Gly 

Ala Pro Trp Phe 
240 

Gin Ala Met Gly 
255 

Asn Leu Gly His 
270 

Asp Arg Tyr Val 
285 

Val Ser Gly Ala 

He Gly Val Met 
320 

Val Ala Ala Val 
335 

Ala Leu He Gin 
350 

Val Val Phe Gly 
365 

Asn Gin Val Asp 
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10 



15 



40 



370 375 380 

Phe Ser Asp Asn Arg Asn Leu lie Val Ala Ala Val Thr Leu Val Leu 
385 390 395 400 

Gly Ala Gly Asp Phe Ser Val Lys Leu Gly Asp Phe Ser Met Asn Gly 

405 410 415 

lie Gly Thr Ala Thr Phe Gly Ala He He Leu Tyr Ala Leu Leu Gly 

420 425 430 

Leu Ala Arg Arg Arg 
435 

<210> 75 
<211> 1536 
<212> DNA 

<213> Bordetella pertussis 



<220> 

<221> CDS 

<222> (1)...(1536)

<400> 75

atg ctt gaa agg atc aag gtc cgc acc gcg atg gtg gcg gta ttc gcg 48
Met Leu Glu Arg Ile Lys Val Arg Thr Ala Met Val Ala Val Phe Ala
1 5 10 15

tgc ttc ctg gcg gtg ctg atg ctg tcg ggc gcc ctg acg tgg cgc aac 96
Cys Phe Leu Ala Val Leu Met Leu Ser Gly Ala Leu Thr Trp Arg Asn
20 25 30

gcg ggc agg agc gcc gcc gag atc gag ggg ctg aac cag gtc gcc gtc 144
Ala Gly Arg Ser Ala Ala Glu Ile Glu Gly Leu Asn Gln Val Ala Val
35 40 45

aac cag gtc gac ccg ctg ttc gag gcc agc ggc gcg gcg cag cgc cag 192
Asn Gln Val Asp Pro Leu Phe Glu Ala Ser Gly Ala Ala Gln Arg Gln
50 55 60

gcg gcc acg caa ttc cag cgc tac gtg gac gtg ccc aag gag ccg gcc 240
Ala Ala Thr Gln Phe Gln Arg Tyr Val Asp Val Pro Lys Glu Pro Ala
65 70 75 80

gcg gcc gag ctg gcc gcg acc ctg cag acg cgc tgg cgc gcc tac cag 288
Ala Ala Glu Leu Ala Ala Thr Leu Gln Thr Arg Trp Arg Ala Tyr Gln
85 90 95

tcg gtg ctg gac gag ctg gcc gcc gcc gtc gac gcc ggc cag gcc gag 336
Ser Val Leu Asp Glu Leu Ala Ala Ala Val Asp Ala Gly Gln Ala Glu
100 105 110

ccc gcc ctg gcc gcc atg cat cgc gcg cag cag gcc gaa cat gca ttc 384
Pro Ala Leu Ala Ala Met His Arg Ala Gln Gln Ala Glu His Ala Phe
115 120 125

cag cgc gac atg gaa gcc ttt ctg gcc agg gta cag gcg cac agc gac 432
Gln Arg Asp Met Glu Ala Phe Leu Ala Arg Val Gln Ala His Ser Asp
130 135 140

gaa gtg cgc agc ggc gcc gag gac acc cat gtc gtg gcc cgc tgg agc 480
Glu Val Arg Ser Gly Ala Glu Asp Thr His Val Val Ala Arg Trp Ser
145 150 155 160

gcc atc gcg ctg acc acg ctg ggc gtg ctg ctg acc ctg gcc ggc tgg 528
Ala Ile Ala Leu Thr Thr Leu Gly Val Leu Leu Thr Leu Ala Gly Trp
165 170 175

ctg ttc gtg cgc cgc gcg gtg ctg cgc ccc ttg ctg gag gcc ggc cat 576
Leu Phe Val Arg Arg Ala Val Leu Arg Pro Leu Leu Glu Ala Gly His
180 185 190

cat ttc gac cgc atc gcc gac ggc gac ctc acc gcg cgc atc gag gtg 624
His Phe Asp Arg Ile Ala Asp Gly Asp Leu Thr Ala Arg Ile Glu Val
195 200 205

cgc tcg gcc aat gaa atc ggc gcg ctg ttc gcg gcg ctc aag cgc atg 672
Arg Ser Ala Asn Glu Ile Gly Ala Leu Phe Ala Ala Leu Lys Arg Met
210 215 220

cag gaa ggc ctg acg cgc acc atc gcc gtc atg cgg cgc ggc gtc gac 720
Gln Glu Gly Leu Thr Arg Thr Ile Ala Val Met Arg Arg Gly Val Asp
225 230 235 240

gaa atc aac gtc ggc gcg gcc gag atc tcg gcc ggc aac gcc aac ctg 768
Glu Ile Asn Val Gly Ala Ala Glu Ile Ser Ala Gly Asn Ala Asn Leu
245 250 255

tcc agc cgc acg gag gag cag gcc gcc gcc ctg gaa gag acc gcg gcc 816
Ser Ser Arg Thr Glu Glu Gln Ala Ala Ala Leu Glu Glu Thr Ala Ala
260 265 270

acc atg gag gaa ctg gcc acc acg gtc aag cag aac gcc gac aat gcc 864
Thr Met Glu Glu Leu Ala Thr Thr Val Lys Gln Asn Ala Asp Asn Ala
275 280 285

gcg cag gcc aat cag ctg gcc gcc gtc agc atg cag gtg gcg cag cgc 912
Ala Gln Ala Asn Gln Leu Ala Ala Val Ser Met Gln Val Ala Gln Arg
290 295 300

ggc ggc gag tcg gtc gcg cag gtg gtg cag acc atg cac ggc atc tcc 960
Gly Gly Glu Ser Val Ala Gln Val Val Gln Thr Met His Gly Ile Ser
305 310 315 320

gcg agc tcg cgc cag atc gcc gac atc gtc acc gtg atc gac ggc atc 1008
Ala Ser Ser Arg Gln Ile Ala Asp Ile Val Thr Val Ile Asp Gly Ile
325 330 335

gcc ttc cag acc aat atc ctg gcg ctg aac gcc gcg gtc gag gcg gcg 1056
Ala Phe Gln Thr Asn Ile Leu Ala Leu Asn Ala Ala Val Glu Ala Ala
340 345 350

cgc gcc ggc gaa cag ggc aag ggc ttc gcg gtg gtg gcg ggc gag gtg 1104
Arg Ala Gly Glu Gln Gly Lys Gly Phe Ala Val Val Ala Gly Glu Val
355 360 365

cgc agc ctg gcc cag cgc gcc gcg cag gcg gcc aag gag atc aag gcc 1152
Arg Ser Leu Ala Gln Arg Ala Ala Gln Ala Ala Lys Glu Ile Lys Ala
370 375 380

ctg atc gag agc tcg gtg gcg acg gtg cgc gcc ggc tcg caa cag gtc 1200
Leu Ile Glu Ser Ser Val Ala Thr Val Arg Ala Gly Ser Gln Gln Val
385 390 395 400

gcc agc gcc ggc ggc acc atg gac gag gtg gtg gcc tcg gta cag cgc 1248
Ala Ser Ala Gly Gly Thr Met Asp Glu Val Val Ala Ser Val Gln Arg
405 410 415

gtg gcc gac atc atg ggg gag atc tcg gcc gcc tcg gcc cag cag gcc 1296
Val Ala Asp Ile Met Gly Glu Ile Ser Ala Ala Ser Ala Gln Gln Ala
420 425 430

agc ggc atc gac cag gtc agc ctg gcg att tcg caa atg gac gaa acc 1344
Ser Gly Ile Asp Gln Val Ser Leu Ala Ile Ser Gln Met Asp Glu Thr
435 440 445

acc cag cag aat gcc gcg ctg gtc gaa cag gcc gcg gcg gcg gcc acg 1392
Thr Gln Gln Asn Ala Ala Leu Val Glu Gln Ala Ala Ala Ala Ala Thr
450 455 460

gcc atg gaa gaa cag gcc cgc cac ctg gcg gcc gcg gcg gcg gtc ttc 1440
Ala Met Glu Glu Gln Ala Arg His Leu Ala Ala Ala Ala Ala Val Phe
465 470 475 480

agg acg cag ggc ggc gcc atc atc gac gtc gcc gcc gcg ccg ctg gcc 1488
Arg Thr Gln Gly Gly Ala Ile Ile Asp Val Ala Ala Ala Pro Leu Ala
485 490 495

ggg ccg gcg ggc ggc cat gcc gcc ctg ccg ccg gcc gcg gcc cac tga 1536
Gly Pro Ala Gly Gly His Ala Ala Leu Pro Pro Ala Ala Ala His *
500 505 510
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<210> 77 
<211> 477 
<212> DNA 

<213> Bordetella pertussis 

<220> 

<221> CDS 

<222> (1) . . . (477) 

50 <400> 77 

atg tct gcc att cct ttg acc gtg cgc ggg gcc gag cgc ttg cag caa 48
Met Ser Ala Ile Pro Leu Thr Val Arg Gly Ala Glu Arg Leu Gln Gln
1 5 10 15

gaa ctg cat cgg ctt aag acc gtt gag cgt cct gcg gtg atc agc gcc 96
Glu Leu His Arg Leu Lys Thr Val Glu Arg Pro Ala Val Ile Ser Ala
20 25 30

att gcg gag gcg cgt gcg cag ggt gat ttg tcg gaa aat gcc gag tac 144
Ile Ala Glu Ala Arg Ala Gln Gly Asp Leu Ser Glu Asn Ala Glu Tyr
35 40 45

gac gcc gcc cgc gaa cgc cag ggc ttc atc gaa ggc cgg atc tcc gaa 192
Asp Ala Ala Arg Glu Arg Gln Gly Phe Ile Glu Gly Arg Ile Ser Glu
50 55 60

ctc gag ggc acg ctt tcg aac gcg cac ctc atc gat cca acg gcg ctc 240
Leu Glu Gly Thr Leu Ser Asn Ala His Leu Ile Asp Pro Thr Ala Leu
65 70 75 80

gac gcc gaa ggc cgt gcc gtg ttc ggc gcg acc gtg gaa atc gaa gac 288
Asp Ala Glu Gly Arg Ala Val Phe Gly Ala Thr Val Glu Ile Glu Asp
85 90 95

ctc gac tcg ggc gac cgc ctg acc tac cag atc gtg ggc gac gtc gaa 336
Leu Asp Ser Gly Asp Arg Leu Thr Tyr Gln Ile Val Gly Asp Val Glu
100 105 110

gcc gac atc aag tcc aac ctg att tcg gtc tcc agc ccg gtg gcc cgc 384
Ala Asp Ile Lys Ser Asn Leu Ile Ser Val Ser Ser Pro Val Ala Arg
115 120 125

gcc ctg atc ggc aaa tcc gag ggc gat gtg gtc gaa gtg aag gtg ccg 432
Ala Leu Ile Gly Lys Ser Glu Gly Asp Val Val Glu Val Lys Val Pro
130 135 140

gct ggc gtg cgc gag tac gaa gtc atc ggt gtg cgt tat ctc tga 477
Ala Gly Val Arg Glu Tyr Glu Val Ile Gly Val Arg Tyr Leu *
145 150 155
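
As an illustrative cross-check that is not part of the filing, the pairing of each codon row with its three-letter residue row in a CDS listing such as SEQ ID NO:77 can be reproduced with the standard genetic code. The following is a minimal Python sketch under that assumption; the names `CODON_TABLE`, `THREE_LETTER` and `translate` are introduced here for illustration only.

```python
from itertools import product

# Standard genetic code, with codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue names, as used in the sequence listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(cds: str) -> list[str]:
    """Translate a coding sequence into three-letter residues.

    Whitespace between codons is ignored; translation stops at the
    first stop codon. A codon containing a character other than
    t, c, a or g raises KeyError, which flags transcription damage.
    """
    seq = "".join(cds.lower().split())
    if len(seq) % 3:
        raise ValueError("coding sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(seq), 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        residues.append(THREE_LETTER[aa])
    return residues


# First codon row of SEQ ID NO:77 as given in the listing.
first_row = "atg tct gcc att cct ttg acc gtg cgc ggg gcc gag cgc ttg cag caa"
print(" ".join(translate(first_row)))
```

Run over a whole listing, a check of this kind also confirms that the stated length (here 477 nucleotides) equals three times the residue count plus the stop codon.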



<210> 78
<211> 158
<212> PRT

<213> Bordetella pertussis

<400> 78

Met Ser Ala Ile Pro Leu Thr Val Arg Gly Ala Glu Arg Leu Gln Gln
1 5 10 15

Glu Leu His Arg Leu Lys Thr Val Glu Arg Pro Ala Val Ile Ser Ala
20 25 30

Ile Ala Glu Ala Arg Ala Gln Gly Asp Leu Ser Glu Asn Ala Glu Tyr
35 40 45

Asp Ala Ala Arg Glu Arg Gln Gly Phe Ile Glu Gly Arg Ile Ser Glu
50 55 60

Leu Glu Gly Thr Leu Ser Asn Ala His Leu Ile Asp Pro Thr Ala Leu
65 70 75 80

Asp Ala Glu Gly Arg Ala Val Phe Gly Ala Thr Val Glu Ile Glu Asp
85 90 95

Leu Asp Ser Gly Asp Arg Leu Thr Tyr Gln Ile Val Gly Asp Val Glu
100 105 110

Ala Asp Ile Lys Ser Asn Leu Ile Ser Val Ser Ser Pro Val Ala Arg
115 120 125

Ala Leu Ile Gly Lys Ser Glu Gly Asp Val Val Glu Val Lys Val Pro
130 135 140

Ala Gly Val Arg Glu Tyr Glu Val Ile Gly Val Arg Tyr Leu
145 150 155




<210> 79 
<211> 951



WO 00/37493 



PCT/EP99/10297 



<212> DNA 

<213> Bordetella pertussis 
<220> 

<221> CDS

<222> (1)...(951)

<400> 79

atg aac acc cat aag cat gcc cga ttg acc ttc cta cgt cga ctc gaa 48
Met Asn Thr His Lys His Ala Arg Leu Thr Phe Leu Arg Arg Leu Glu
1 5 10 15

atg gtc cag caa ttg atc gcc cat caa gtt tgt gtg cct gaa gcg gcc 96
Met Val Gln Gln Leu Ile Ala His Gln Val Cys Val Pro Glu Ala Ala
20 25 30

cgc gcc tat ggg gtc acc gcg ccg act gtg cgc aaa tgg ctg ggc cgc 144
Arg Ala Tyr Gly Val Thr Ala Pro Thr Val Arg Lys Trp Leu Gly Arg
35 40 45

ttc ctg gct cag ggc cag gcg ggc ttg gcc gat gcg tcc tcg cgc ccg 192
Phe Leu Ala Gln Gly Gln Ala Gly Leu Ala Asp Ala Ser Ser Arg Pro
50 55 60

acg gtc tcg ccc cga gcg att gcg ccg gcc aag gcg ctg gct atc gtg 240
Thr Val Ser Pro Arg Ala Ile Ala Pro Ala Lys Ala Leu Ala Ile Val
65 70 75 80

gag ctg cgc cgc aag cgg ctg acc caa gcg cgc atc gcc cag gcg ctg 288
Glu Leu Arg Arg Lys Arg Leu Thr Gln Ala Arg Ile Ala Gln Ala Leu
85 90 95

ggc gtg tca gcc agc acc gtc agc cgc gtc ctg gcc cgc gcc ggt ctg 336
Gly Val Ser Ala Ser Thr Val Ser Arg Val Leu Ala Arg Ala Gly Leu
100 105 110

tcg cac ctg gcc gac ctg gag ccg gcc gag ccg gtg gtg cgc tac gag 384
Ser His Leu Ala Asp Leu Glu Pro Ala Glu Pro Val Val Arg Tyr Glu
115 120 125

cat cag gcc ccc ggc gat ctg ctg cac atc gac atc aag aag ctg gga 432
His Gln Ala Pro Gly Asp Leu Leu His Ile Asp Ile Lys Lys Leu Gly
130 135 140

cgt atc cag cgc cct ggc cac cgg gtc acg ggc aac cga cgc gat acc 480
Arg Ile Gln Arg Pro Gly His Arg Val Thr Gly Asn Arg Arg Asp Thr
145 150 155 160

gtt gag ggg gcc ggc tgg gac ttc gtc ttc gtg gcc atc gat gac cac 528
Val Glu Gly Ala Gly Trp Asp Phe Val Phe Val Ala Ile Asp Asp His
165 170 175

gcc cgc gtg gcc ttc acc gac atc cac ccc gac gag cgc ttc ccc agc 576
Ala Arg Val Ala Phe Thr Asp Ile His Pro Asp Glu Arg Phe Pro Ser
180 185 190

gcc gtc cag ttc ctc aag gac gca gtg gcc tac tac cag cgc ctg ggc 624
Ala Val Gln Phe Leu Lys Asp Ala Val Ala Tyr Tyr Gln Arg Leu Gly
195 200 205

gtg acc atc cag cgc ttg ctc acc gac aat ggc tcg gcc ttc cgc agc 672
Val Thr Ile Gln Arg Leu Leu Thr Asp Asn Gly Ser Ala Phe Arg Ser
210 215 220

cgc gcc ttc gcc gcg ctg cgc cat gag ctg ggc atc aag cac cgc ttt 720
Arg Ala Phe Ala Ala Leu Cys His Glu Leu Gly Ile Lys His Arg Phe
225 230 235 240

acc cga cct tac cgc cca cag acc aat ggc aag gcc gaa cgc ttc atc 768
Thr Arg Pro Tyr Arg Pro Gln Thr Asn Gly Lys Ala Glu Arg Phe Ile
245 250 255

cag tcg gcc ttg cgt gag tgg gct tac gct cac acc tac cag aac tcc 816
Gln Ser Ala Leu Arg Glu Trp Ala Tyr Ala His Thr Tyr Gln Asn Ser
260 265 270

caa cac cga gcc gat gcc atg aaa tcc tgg cta cac cac tac aac tgg 864
Gln His Arg Ala Asp Ala Met Lys Ser Trp Leu His His Tyr Asn Trp
275 280 285

cat cga ccc cac caa ggc atc ggg cgc gct gta ccc atc tcc aga ctc 912
His Arg Pro His Gln Gly Ile Gly Arg Ala Val Pro Ile Ser Arg Leu
290 295 300

aac ctg gac gaa tac aac cta ttg aca gtt cac acc tag 951
Asn Leu Asp Glu Tyr Asn Leu Leu Thr Val His Thr *
305 310 315
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130 135 140 

tac ttc ccc gcg tgg aag tgg gtt etc gec ate tec gat age tea caa 480 

Tyr Phe Pro Ala Trp Lys Trp Val Leu Ala lie Ser Asp Ser Ser Gin 

145 150 155 160 

gee ate ate gac aag gtt gee gee cag aaa gee aac atg att gee gcg 528 

Ala lie lie Asp Lys Val Ala Ala Gin Lys Ala Asn Met lie Ala Ala 

165 170 175 

ata gac egg aac ctg teg gag ctg egg etc age cgc cat ggt ttc gtg 576 

lie Asp Arg Asn Leu Ser Glu Leu Arg Leu Ser Arg His Gly Phe Val 

180 185 190 



15 ttc gtg gtt gcg gac gat ggc acg gtg ate gtg ccg cca ccc cca teg 624 

Phe Val Val Ala Asp Asp Gly Thr Val He Val Pro Pro Pro Pro Ser 

195 200 205 

gec gec egg ctg ctg gac teg aca gac gtc gaa teg gga egg gta ttg 672 

20 Ala Ala Arg Leu Leu Asp Ser Thr Asp Val Glu Ser Gly Arg Val Leu 

210 215 220 

cat teg atg ctt gec gaa ate teg tct acc cgc ggc ctg acg ttg cgc 720 

His Ser Met Leu Ala Glu He Ser Ser Thr Arg Gly Leu Thr Leu Arg 
25 225 230 235 240 

ttt acc aac ggc gaa age gec tgg cag ate gac gec ctg cga tac aag 768 

Phe Thr Asn Gly Glu Ser Ala Trp Gin lie Asp Ala Leu Arg Tyr Lys 

245 250 255 

30 

ccg ctg cat tgg acc ate ate ggt gtc gtt ccc gag ccg gac ctg acc 816 

Pro Leu His Trp Thr lie He Gly Val Val Pro Glu Pro Asp Leu Thr 

260 265 270 

35 gac ccg gca cag aat ctg gtg cgc egg cag gca ctg ate ttc gec gec 864 

Asp Pro Ala Gin Asn Leu Val Arg Arg Gin Ala Leu lie Phe Ala Ala 

275 280 285 

acc ttg ctg gec ggg ctg atg ctg gca tgg gtg gtg gcg gtg cgc ate 912 

40 Thr Leu Leu Ala Gly Leu Met Leu Ala Trp Val Val Ala Val Arg He 

290 295 300 



gec egg ccg ttg gcg caa ctg age aac tac get cgc cag ctt ccc acc 960 
Ala Arg Pro Leu Ala Gin Leu Ser Asn Tyr Ala Arg Gin Leu Pro Thr 

315 320 



45 305 310 



cag gac etc acc gag ccg ate egg gtt ccg ccg teg gtg gca tgc ctg 1008 

Gin Asp Leu Thr Glu Pro lie Arg Val Pro Pro Ser Val Ala Cys Leu 
325 330 335 

ccg cgc egg egg cgc gac gaa gtc gga cag etc gec gaa teg ttc ctg 1056 

Pro Arg Arg Arg Arg Asp Glu Val Gly Gin Leu Ala Glu Ser Phe Leu 
340 345 350 



55 ttc atg aac gaa cag ctg cac cac aat gtg egg gec ctg atg gcg cag 
Phe Met Asn Glu Gin Leu His His Asn Val Arg Ala Leu Met ala Gin 
355 360 365 



1104 



ata teg aac cgc gaa cgc etc gaa age gaa ttg age ate gec cgc tec 1152 
60 He Ser Asn Arg Glu Arg Leu Glu Ser Glu Leu Ser He Ala Arg Ser 
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370 375 380 

ate caa ctt ggc ctg ctt ccc cag ccg ttg ccc gat gcg gec acg cgc 1200 

lie Gin Leu Gly Leu Leu Pro Gin Pro Leu Pro Asp Ala Ala Thr Arg 

385 390 395 400 

ggc age cag ttg cgt gec gtc atg tac ccg gec egg gag gtc ggt ggg 1248 

Gly Ser Gin Leu Arg Ala Val Met Tyr Pro Ala Arg Glu Val Gly Gly 

405 410 415 

gat ttc tac gac tac ttc gtg ctg gca gac ggg cgt ctg tgc ttt gec 1296 

Asp Phe Tyr Asp Tyr Phe Val Leu Ala Asp Gly Arg Leu Cys Phe Ala 

420 425 430 



15 ate ggc gac gta tec gga aaa ggc gtg ccc gcg gec ctg ttc atg gec 1344 

lie Gly Asp Val Ser Gly Lys Gly Val Pro Ala Ala Leu Phe Met ala 
435 440 445 

ate gtc agg acc ttg ata cgc age gtg gcg gaa gaa gag cac gac ccg 1392 

20 lie Val Arg Thr Leu lie Arg Ser Val Ala Glu Glu Giu His Asp Pro 
450 455 460 

ggc gec ate gec acc aag gtg aac cac cgt ctg gec gag aac aac ccc 14 40 

Gly Ala lie Ala Thr Lys Val Asn His Arg Leu Ala Glu Asn Asn Pro 
25 465 470 475 480 

aag ctg atg ttt gtc acc ttg ctg ata ggc gtc ttc acc ccg gaa aca 1488 

Lys Leu Met Phe Val Thr Leu Leu lie Gly Val Phe Thr Pro Glu Thr 

485 490 495 



ggc gec ctg gec tgg gtc aac gec ggc cac ccg ccg ccg ctg etc ate 1536 
Gly Ala Leu Ala Trp Val Asn Ala Gly His Pro Pro Pro Leu Leu lie 
500 505 510 



35 gac gaa cgt ggc gag gtc cgc ctg ctt caa gga age age ggc gcg gee 1584 

Asp Glu Arg Gly Glu Val Arg Leu Leu Gin Gly Ser Ser Gly Ala Ala 
515 520 525 

tgc ggc gtg ctg gac aac gag gcg tat tec acc ctg age acc acc ttg 1632 

40 Cys Gly Val Leu Asp Asn Glu Ala Tyr Ser Thr Leu Ser Thr Thr Leu 
530 535 540 

ccg aac ggc acc teg ctg gtc gcg ttt acc gac ggc gtc acc gaa gee 1680 

Pro Asn Gly Thr Ser Leu Val Ala Phe Thr Asp Gly Val Thr Glu Ala 

45 545 550 555 560 

ate cac ggc ggc tgc gec cag tat ggt ctg ccg egg ctg gtc gec ctg 1728 

lie His Gly Gly Cys Ala Gin Tyr Gly Leu Pro Arg Leu Val Ala Leu 

565 570 575 



atg cag ggc gcg ccg cac gca gcg gec gaa etc ate gag cac att ctg 1776 
Met Gin Gly Ala Pro His Ala Ala Ala Glu Leu He Glu His He Leu 
580 585 590 



55 cac gac eta cgc gaa ttc gec gec gat tec gaa caa tec gac gat etc 1824 

His Asp Leu Arg Glu Phe Ala Ala Asp Ser Glu Gin Ser Asp Asp Leu 
595 600 605 

acc ate ate gec att cat cgc cca tga 1851 

60 Thr lie lie Ala He His Arg Pro * 
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<210> 82 
<211> 616

<212> PRT

<213> Bordetella pertussis
<400> 82

Met Asp Leu Val Val Arg Asp Thr Asp Thr Arg Trp Ser Thr Leu Leu
1 5 10 15

Asp Asp Lys Ile Arg Thr Ile Arg Glu Ser Arg Arg Gln Leu Ile Gln
20 25 30

Leu Ser Ala Val Val Thr Ser Val Leu Asn Ala Tyr Ala Ala Gln Ala
35 40 45

Glu Arg Gly His Val Thr Thr Gly Ala Ala Lys Gly Met Ala Arg Val
50 55 60

Trp Leu Asn His Leu Asp Leu Gly Pro Arg Arg Val Ala Phe Ala Tyr
65 70 75 80

Asp Ala Glu Gly Thr Val Leu Ala Ser Thr Asn Pro Arg Met Ile Asp
85 90 95

Arg Asp Leu Ser Gly Ile Arg Asp Phe Lys Gly Arg Pro Leu Ala Ala
100 105 110

Ala Met Tyr Glu Glu Ser Arg Asn Asp Gly Arg Gly Phe Ala Ile Tyr
115 120 125

Pro Ser Pro Leu Asp Glu Ser Ala Gln Met Arg His Ala Tyr Phe Val
130 135 140

Tyr Phe Pro Ala Trp Lys Trp Val Leu Ala Ile Ser Asp Ser Ser Gln
145 150 155 160

Ala Ile Ile Asp Lys Val Ala Ala Gln Lys Ala Asn Met Ile Ala Ala
165 170 175

Ile Asp Arg Asn Leu Ser Glu Leu Arg Leu Ser Arg His Gly Phe Val
180 185 190

Phe Val Val Ala Asp Asp Gly Thr Val Ile Val Pro Pro Pro Pro Ser
195 200 205

Ala Ala Arg Leu Leu Asp Ser Thr Asp Val Glu Ser Gly Arg Val Leu
210 215 220

His Ser Met Leu Ala Glu Ile Ser Ser Thr Arg Gly Leu Thr Leu Arg
225 230 235 240

Phe Thr Asn Gly Glu Ser Ala Trp Gln Ile Asp Ala Leu Arg Tyr Lys
245 250 255

Pro Leu His Trp Thr Ile Ile Gly Val Val Pro Glu Pro Asp Leu Thr
260 265 270

Asp Pro Ala Gln Asn Leu Val Arg Arg Gln Ala Leu Ile Phe Ala Ala
275 280 285

Thr Leu Leu Ala Gly Leu Met Leu Ala Trp Val Val Ala Val Arg Ile
290 295 300

Ala Arg Pro Leu Ala Gln Leu Ser Asn Tyr Ala Arg Gln Leu Pro Thr
305 310 315 320

Gln Asp Leu Thr Glu Pro Ile Arg Val Pro Pro Ser Val Ala Cys Leu
325 330 335

Pro Arg Arg Arg Arg Asp Glu Val Gly Gln Leu Ala Glu Ser Phe Leu
340 345 350

Phe Met Asn Glu Gln Leu His His Asn Val Arg Ala Leu Met Ala Gln
355 360 365

Ile Ser Asn Arg Glu Arg Leu Glu Ser Glu Leu Ser Ile Ala Arg Ser
370 375 380

Ile Gln Leu Gly Leu Leu Pro Gln Pro Leu Pro Asp Ala Ala Thr Arg
385 390 395 400

Gly Ser Gln Leu Arg Ala Val Met Tyr Pro Ala Arg Glu Val Gly Gly
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Amended Claims for PCT Application No. PCT/EP99/10297 



Claims: 



1. An isolated polypeptide comprising an amino acid sequence which has at least 75% 
identity to the amino acid sequence selected from the group consisting of: SEQ ID 
NO:42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70 and 72 over its entire
length.

2. The polypeptide as claimed in claim 1 comprising the amino acid sequence selected 
from the group consisting of: SEQ ID NO:42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 
66, 68, 70 and 72.

3. An isolated polypeptide of SEQ ID NO:42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 
66, 68, 70 or 72.

4. An isolated polypeptide comprising a fragment of at least 7 consecutive amino acids 
of the polypeptide as claimed in any one of claims 1 to 3, wherein the fragment 
comprises an epitope. 

5. The polypeptide of claim 4, wherein the fragment is immunogenic. 

6. An isolated polynucleotide comprising a nucleotide sequence encoding a polypeptide 
that has at least 75% identity to the amino acid sequence of SEQ ID NO:42, 44, 46, 48, 
50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70 or 72 over its entire length; or a nucleotide 
sequence complementary to said isolated polynucleotide. 

7. An isolated polynucleotide comprising a nucleotide sequence that has at least 75% 
identity to a nucleotide sequence, encoding a polypeptide of SEQ ID NO:42, 44, 46, 48, 
50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70 or 72, over its entire length; or a nucleotide 
sequence complementary to said isolated polynucleotide. 
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8. An isolated polynucleotide which comprises a nucleotide sequence which has at least 
75% identity to that of SEQ ID NO:41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67,
69 or 71 over its entire length; or a nucleotide sequence complementary to said isolated 
polynucleotide. 

9. The isolated polynucleotide as claimed in any one of claims 6 to 8 in which the 
identity is at least 95% to SEQ ID NO:41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 
67, 69 or 71 over its entire length. 

10. An isolated polynucleotide comprising a nucleotide sequence encoding the 
polypeptide of SEQ ID NO:42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70 or
72.

11. An isolated polynucleotide comprising the polynucleotide of SEQ ID NO:41, 43,
45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69 or 71. 

12. An isolated polynucleotide comprising a nucleotide sequence encoding the 
polypeptide of SEQ ID NO:42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70 or 
72, obtainable by screening an appropriate library under stringent hybridization conditions
with a labeled probe having the sequence of SEQ ID NO:41, 43, 45, 47, 49, 51, 53, 55,
57, 59, 61, 63, 65, 67, 69 or 71 or a fragment thereof.

13. An expression vector comprising an isolated polynucleotide according to any one of 
claims 6-12.

14. A recombinant live microorganism comprising an isolated polynucleotide according 
to any one of claims 6 - 12. 

15. A host cell comprising the expression vector of claim 13 or a subcellular fraction or 
a membrane of said host cell.
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16. A process for producing the polypeptide of claim 1 comprising the steps of 
culturing a host cell of claim 15 under conditions sufficient for the production of said 
polypeptide and recovering the polypeptide from the culture medium. 

17. A process for expressing a polynucleotide of any one of claims 6 - 12 comprising 
transforming a host cell with an expression vector comprising at least one of said
polynucleotides and culturing said host cell under conditions sufficient for expression 
of any one of said polynucleotides. 

18. A vaccine composition comprising an effective amount of the polypeptide of any 
one of claims 1 to 5 and a pharmaceutically acceptable carrier.

19. The vaccine composition of claim 18, wherein the polypeptide has an amino acid
sequence selected from the group consisting of: SEQ ID NO:42, 46, 48, 50, 52, 54, 
56, 58, 60 and 62. 

20. A vaccine composition comprising an effective amount of the polynucleotide of 
any one of claims 6 to 12 and a pharmaceutically acceptable carrier.

21. The vaccine composition according to any one of claims 18-20, wherein said 
composition comprises at least one other Bordetella pertussis antigen. 

22. An antibody immunospecific for the amino acid sequence of claim 1 or 2, the 
polypeptide of claim 3 or the fragment of claim 4 or 5. 

23. A method of diagnosing a Bordetella pertussis infection, comprising identifying a 
polypeptide as claimed in any one of claims 1 - 5, or an antibody that is immunospecific 
for said polypeptide, present within a biological sample from an animal suspected of
having such an infection. 
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24. Use of a composition comprising an immunologically effective amount of a 
polypeptide as claimed in any one of claims 1 - 5 in the preparation of a medicament 
for use in generating an immune response in an animal. 

25. Use of a composition comprising an immunologically effective amount of a 
polynucleotide as claimed in any one of claims 6-12 in the preparation of a 
medicament for use in generating an immune response in an animal. 

26. A therapeutic composition useful in treating humans with Bordetella pertussis 
disease comprising at least one antibody directed against the polypeptide of claims 1 - 5 
and a suitable pharmaceutical carrier. 

27. A kit for diagnosing infection with B. pertussis bacteria in a human comprising a
polynucleotide of claims 6-12 or a polypeptide of claims 1-5. 

28. A method of identifying virulence genes from a pathogenicity island containing a 
type III secretion system from pathogenic strains of bacteria, comprising: 

designing degenerate PCR primers complementary to well-conserved regions
specific to the LcrD polypeptide of Yersinia;

amplifying the polynucleotide containing the DNA sequence between (and
including the DNA sequence of) the primers of lcrD-like genes present in said
pathogenic strain of bacteria;
sequencing the lcrD-like gene;

determining whether the DNA sequence is more homologous: to the virulence-
associated family of lcrD-like genes, or to the flagellar-associated family of
lcrD-like genes; and

if a virulence-associated member, sequencing the entire pathogenicity island;
and

identifying genes within this sequence.

29. A method of determining whether a particular bacterial strain harbours a type III 
secretion system involved in pathogenicity, comprising: 
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designing degenerate PCR primers complementary to well-conserved regions 
specific to the LcrD polypeptide of Yersinia; 

amplifying the polynucleotide containing the DNA sequence between (and 
including the DNA sequence of) the primers to determine the presence of any 
lcrD-like genes in said bacterial strain;
if amplified successfully, sequencing the lcrD-like gene; and
determining whether the DNA sequence is more homologous: to the virulence-
associated family of lcrD-like genes, or to the flagellar-associated family of
lcrD-like genes.
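
By way of illustration only, the primer-design step recited in claims 28 and 29 can be sketched as the back-translation of a conserved peptide motif into a degenerate oligonucleotide written in IUPAC ambiguity codes. The Python sketch below assumes the standard genetic code; the function names and the four-residue motif `MFDK` are hypothetical and are not taken from the LcrD sequence or from the specification.

```python
from itertools import product
from math import prod

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each one-letter residue to the list of codons that encode it.
CODONS_FOR = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO):
    CODONS_FOR.setdefault(aa, []).append("".join(codon))

# IUPAC nucleotide ambiguity codes, keyed by the sorted set of bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}


def degenerate_primer(peptide: str) -> str:
    """Back-translate a one-letter peptide into a degenerate primer.

    Each codon position is collapsed to the IUPAC code covering every
    base seen at that position across the residue's synonymous codons.
    """
    primer = []
    for aa in peptide.upper():
        codons = CODONS_FOR[aa]
        for pos in range(3):
            bases = "".join(sorted({c[pos] for c in codons}))
            primer.append(IUPAC[bases])
    return "".join(primer)


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides represented by the primer."""
    size = {code: len(bases) for bases, code in IUPAC.items()}
    return prod(size[ch] for ch in primer)


# Hypothetical conserved motif, used only to exercise the functions.
motif = "MFDK"
primer = degenerate_primer(motif)
print(primer, degeneracy(primer))
```

Collapsing each position independently can cover more sequences than the residue strictly requires (leucine becomes `YTN`, eight sequences for six codons), which is one reason short motifs rich in Met, Trp and two-codon residues are usually preferred when choosing primer sites.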
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(54) Title: VACCINE 



(57) Abstract 



This invention relates to a general method for detecting pathogenic strains of bacteria which harbour a type III secretion system. 
More particularly, this invention relates to the methods as applied to the pathogen Bordetella pertussis. Furthermore, the invention relates 
to newly identified polynucleotides within these regions, virulent polypeptides encoded by them and to the use of such polynucleotides and 
polypeptides, and to their production. More particularly the polynucleotides and polypeptides of the present invention relate to the virulent 
effector proteins associated with the type III secretion system of Bordetella pertussis, which are particularly suitable for vaccine purposes. 
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Fig. 2 
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Fig. 4 
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Figure 5 
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Figure 5 (continued) 

1501 TCCTGGTGGC CGAGAACCTG GGCCACGTGA AGGCGGTCGC CGCCATGACC 
1551 GGACAGGACC TGGACCGCTA CGTGGGCCGC GCCTTCGTGG GCGACGGCGT 
1601 GGCGACCATG GTTTCCGGCG CCGTCGGCGG CACCGGGGTG ACCACCTACG 
1651 CCGAGAATAT CGGCGTGATG GCCGTGACGC GCATCTATTC CACGCTGGTG 
1701 TTCGTGGTGG CGGCCGTGAT CGCGCTGGTG CTGGGGTTCT CGCCCAAGTT 
1751 CGGCGCGCTG ATCCAGACCA TCCCCGGCCC CGTGCTGGGG GGCATGTCGG 
1801 TCGTGGTGTT CGGCCTGATC GCCATCGCCG GCGCGCGCAT CTGGGTGGTC 
1851 AACCAGGTCG ATTTCAGCGA CAACCGCAAT CTGATCGTGG CCGCCGTGAC 
1901 CCTGGTGCTG GGGGCGGGCG ACTTCAGCGT CAAGCTGGGC GA7TTCTCGA 
1951 TGAACGGCAT CGGCACCGCC ACGTTCGGCG CCATCATCCT GTACGCCCTG 
2001 CTGGGCCTGG CGCGTCGCCG CTGACGGCGC GCCAACCCGG CACZGGGGCC 
2051 GCGCTCAGTG GGCCGCGGCC GGCGGCAGGG CGGCATGGCC GCCCGCCGGC 
2101 CCGGCCAGCG GCGCGGCGGC GACGTCGATG ATGGCGCCGC CCTGCGTCCT 
2151 GAAGACCGCC GCCGCGGCCG CCAGGTGGCG GGCCTGTTCT TCCATGGCCG 
2201 TGGCCGCCGC CGCGGCCTGT TCGACCAGCG CGGCATTCTG CTGGGTGGTT 
2251 TCGTCCATTT GCGAAATCGC CAGGCTGACC TGGTCGATGC CGCTGGCCTG 
2301 CTGGGCCGAG GCGGCCGAGA TCTCCCCCAT GATGTCGGCC ACGCGCTGTA 
2351 CCGAGGCCAC CACCTCGTCC ATGGTGCCGC CGGCGCTGGC GACCTGTTGC 
2401 GAGCCGGCGC GCACCGTCGC CACCGAGCTC TCGATCAGGG CCTTGATCTC 
2451 CTTGGCCGCC TGCGCGGCGC GCTGGGCCAG GCTGCGCACC TCGCCCGCCA 
2501 CCACCGCGAA GCCCTTGCCC TGTTCGCCGG CGCGCGCCGC CTCGACCGCG 
2551 GCGTTCAGCG CCAGGATATT GGTCTGGAAG GCGATGCCGT CGATCACGGT 
2601 GACGATGTCG GCGATCTGGC GCGAGCTCGC GGAGATGCCG TGCATGGTCT 
2651 GCACCACCTG CGCGACCGAC TCGCCGCCGC GCTGCGCCAC CTGCATGCTG 
2701 ACGGCGGCCA GCTGATTGGC CTGCGCGGCA TTGTCGGCGT TCTGCTTGAC 
2751 CGTGGTGGCC AGTTCCTCCA TGGTGGCCGC GGTCTCTTCC AGGGCGGCGG 
2801 CCTGCTCCTC CGTGCGGCTG GACAGGTTGG CGTTGCCGGC CGAGATCTCG 
2851 GCCGCGCCGA CGTTGATTTC GTCGACGCCG CGCCGCATGA CGGCGATGGT 
2901 GCGCGTCAGG CCTTCCTGCA TGCGCTTGAG CGCCGCGAAC AGC3CGCCGA 
2951 TTTCATTGGC CGAGCGCACC TCGATGCGCG CGGTGAGGTC GCCGTCGGCG 
3001 ATGCGGTCGA AATGATGGCC GGCCTCCAGC AAGGGGCGCA GCACCGCGCG 
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Figure 5 (continued) 

3051 GCGCACGAAC AGCCAGCCGG CCAGGGTCAG CAGCACGCCC AGCGTGG7CA 
3101 GCGCGATGGC GCTCCAGCGG GCCACGACAT GGGTGTCCTC GGCGCCGC7G 
3151 CGCACTTCGT CGCTGTGCGC CTGTACCCTG GCCAGAAAGG CTTCCATGTC 
3201 GCGCTGGAAT GCATGTTCGG CCTGCTGCGC GCGATGCATG GCGGCCAGGG 
3251 CGGGCTCGGC CTGGCCGGCG TCGACGGCGG CGGCCAGCTC GTCCAGCACC 
3301 GACTGGTAGG CGCGCCAGCG CGTCTGCAGG GTCGCGGCCA GCTCGGCCGC 

3351 GGCCGGCTCC TTGGGCACGT CCACGTAGCG CTGGAATTGC GTGGCCGCCT 

3401 GGCGCTGCGC CGCGCCGCTG GCCTCGAACA GCGGGTCGAC CTGGTTGACG 

3451 GCGACCTGGT TCAGCCCCTC GATCTCGGCG GCGCTCCTGC CCGCGTTGCG 

3501 CCACGTCAGG GCGCCCGACA GCATCAGCAC CGCCAGGAAG CACGCGAATA 

3551 CCGCCACCAT CGCGGTGCGG ACCTTGATCC TTTCAAGCAT GGGCATCTCC 

3601 TGGGTAGGGT CTTGCGCAAT AAACGTCGAC GACGTGCGCG GCGTACCCGC 

3651 GCGGGAAGTG TCAGGCCGTG ACGTGGAATC CGTTGTTGAT GTGGAATTCG 

3701 ACTTAAATGA TCGGTATTCG GTATTAATTT GTTATTTGTA GTTATATATA 

3751 CGAATATTCA TACCCGGATT TGCCCTAAGT TGGTGCGTTC TCGACGGGTG 

3801 CTTTCGATTG CCCGGGGCCT GCGGCGCATG AAGGAATGCG CGGCAACGCC 

3851 GGGCCCGCGC TGCGATCGGC GGCCAACACG CAGGTTTTGT GGCTTTTCCG 

3901 CAGCCAACAT GCGCCGAAAC CTACGCCGGG CTTACAGGCT TGCAATTCCG 

3951 GTGGACTTTG CCGACAATGT CATCTGATTG CCCCGGTTCC GACCCGAGCC 

4001 GGGGTTTTGT TTTGGTCGAC GCTTGGCCGC CGGATGCGGC AGGCCGATCA 

4051 AAGAGGAGAC AGCAAAGGGG AGCCTCGGTC GGGTTCGACC TTGTCTCGTC 

4101 TTTTGTTCGC CTGTCTTCCG CAGCGGCCCG CCTGTTTCAT GGCGAGGCCA 

4151 TACACCAAGC CGAGACCTTC ACCGAAACGC TCCGTCGGGG GCGTTTTCTA 

4201 CTTTTGTTTG GGAAACGACA TGTCTGCCAT TCCTTTGACC GTGCGCGGGG 

4251 CCGAGCGCTT GCAGCAAGAA CTGCATCGGC TTAAGACCGT TGAGCGTCCT 

4301 GCGGTGATCA GCGCCATTGC GGAGGCGCGT GCGCAGGGTG ATT7GTCGGA 

4351 AAATGCCGAG TACGACGCCG CCCGCGAACG CCAGGGCTTC ATCGAAGGCC 

4401 GGATCTCCGA ACTCGAGGGC ACGCTTTCGA ACGCGCACCT CATTGATCCA 

4451 ACGGCGCTCG ACGCCGAAGG CCGTGCCGTG TTCGGCGCGA CCGTGGAAAT 

4501 CGAAGACCTC GACTCGGGCG ACCGCCTGAC CTACCAGATC GTGG3CGACG 
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Figure 5 (continued) 

4551 TCGAAGCCGA CATCAAGTCC AACCTGATTT CGGTCTCCAG CCCGGTGGCC 
4601 CGCGCCCTGA TCGGCAAATC CGAGGGCGAT GTGGTCGAAG TG^J-.GGTGCC 
4651 GGCTGGCGTG CGCGAGTACG AAGTCATCGG TGTGCGTTAT CTC7GACGCC 
4701 GATTCCGCCC CCCTGCATAC CCATGGCCAA TGACCGACGC CG77TCCACC 
4751 ATCAGCCAGC CGCTTGCGGC CAGGCGGGGT GGCACAGCCG CAGCCGTCGC 
4801 GCTGCAGGCG TCCCGGATGC GATATCCCGG GCTCCGGTCG TGGCCAGCGT 
4851 ATTCCCCTTC TCGGCCGTCA ACGGCTTGGG GGGCGCGGTA TTCA7CCGAA 
4901 TTCCTGAACG CCGGGCCGTG CGCCGGGCGT CTGTATCGTT GTCGCGCCGC 
4951 GATGCACGGA ACGTGCCGTC GGGCGCCGCA ACGCCGGGCC GCGCCGCC7A 
5001 GGTGTGAACT GTCAATAGGT TGTATTCGTC CAGGTTGAGT CTGGAGATGG 
5051 GTACAGCGCG CCCGATGCCT TGGTGGGGTC GATGCCAGTT GTAG7GGTGT 
5101 AGCCAGGATT TCATGGCATC GGCTCGGTGT TGGGAGTTCT GGTAGGTGTG 
5151 AGCGTAAGCC CACTCACGCA AGGCCGACTG GATGAAGCGT TCGGCCTTGC 
5201 CATTGGTCTG TGGGCGGTAA GGTCGGGTAA AGCGGTGCTT GATGCCCAGC 
5251 TCATGGCACA GCGCGGCGAA GGCGCGGCTG CGAAAGGCCG AGCCATTGTC 
5301 GGTGAGCAAG CGCTGGATGG TCACGCCCAG GCGCTGGTAG TAGGCCACTG 
5351 CGTCCTTGAG GAACTGGACG GCGCTGGGGA AGCGCTCGTC GGGGTGGATG 
5401 TCGGTGAAGG CCACGCGGGC GTGGTCATCG ATGGCCACGA AGACGAAGTC 
5451 CCAGCCGGCC CCCTCAACGG TATCGCGTCG GTTGCCCGTG ACCCGGTGGC 
5501 CAGGGCGCTG GATACGTCCC AGCTTCTTGA TGTCGATGTG CAGCAGATCG 
5551 CCGGGGGCCT GATGCTCGTA GCGCACCACC GGCTCGGCCG GCTCCAGGTC 
5601 GGCCAGGTGC GACAGACCGG CGCGGGCCAG GACGCGGCTG ACGGTGCTGG 
5651 CTGACACGCC CAGCGCCTGG GCGATGCGCG CTTGGGTCAG CCGC7TGCGG 
5701 CGCAGCTCCA CGATAGCCAG CGCCTTGGCC GGCGCAATCG CTCGGGGCGA 
5751 GACCGTCGGG CGCGAGGACG CATCGGCCAA GCCCGCCTGG CCCTGAGCCA 
5801 GGAAGCGGCC CAGCCATTTG CGCACAGTCG GCGCGGTGAC CCCATAGGCG 
5851 CGGGCCGCTT CAGGCACACA AACTTGATGG GCGATCAATT GCTGGACCAT 
5901 TTCGAGTCGA CGTAGGAAGG TCAATCGGGC ATGCTTATGG GTG77CATCC 
5951 GGCCGGGCTC CTTGAGTGAA CTGGGGGGGT GGCGATTTCC AGT77CTCAA 
6001 ATCCGGTTCG GATGAACCAT GCATACAACC TATTGAATCT TCACAACTAG 
6051 CGCGCGTGGC GCGGAAAGAC CAGCAGGTCG GCCGTCACCG GTTCCCTGTT 
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Figure 5 (continued) 
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Figure 5 (continued) 

7601 GCGCTGGTTG AACGCGGCCA CCAGCTCGCG CAGGCGCGCC ATGCGCCCTG 

7651 CCGCCAATCC GCCGGGATCT GCGTCCAGGC GGTCGGCGTG CCAGCTGAGC 

7701 TTGACGCCGT CGAGGCGTTC GTCGGCCAGC TGGGCCGCGA ACTGGGCCGA 

7751 GACCTCGTCG GCCAGGCGTA CATCGCGACC GAGGATCGTC ATGCCCGGCA 

7801 GGCGCATGCG CACCGCATGC AGCGCCGCGG CGCGTTCGTG CGCATCGCTG 

7851 GCGATGCCCG AGATCGCCAG GCGGCCATTG CCGTACGGGC GCGCCATGTA 

7901 GCGCACCCCG AATGTCGCCA GGACATCGCA GGCCAGGGCC CTGGCCTCGT 

7951 CCTGCCTGCT TACCTGCATG GCAGGCCGTG GCGCAAGTTG CGCCAACGCC 
8001 CTGGCGACCC GAGCGAATTC CGTCTCGTCG TGCACCCATC CGGTCACGGT 
8051 GAGCACGCCG CChCGGCCGT AGGCCGCTTG TAATTGCTCG GTAAGGCCCA 
8101 GGCTGTCGAT GAGCGCCGCG GCGCGGACCA GCGGCGCGGT GGGCGTTGGG 
8151 GGCGGCGCGG CCGGCGGCGT GGCGGGTGTG GTCACGGAAA CCAGCGCCGT 
8201 GGCCAGGCCG ACCAGCAGGA CGGCCGCGGC CGCGCCCAGC GCCAGCCAGG 
8251 GCCGTCCTGC ACGTCGGCGC GGCATGAGGG CAGCGACGGA CGGCGGGCTT 

8301 GTCGAGCCAG GGACGTCGTG CAAGGCTGTG TCGCTGCCGT CCGGGCCGCA 
8351 CGGCTCCGGC GGCGCGGGCC ACGGCGCGGA AGGGGCGGCC ACGGTGATCC 
8401 AGGCGGCTCC CAGCTCTACG GGTTCGTTGA AGGCCGCGGG CGGhCACGGC 
8451 GCCTGGGCGT CCAGGCCGGG CGTCACGGCG CCGGCCAACC GCCAGCCGGA 
8501 CTGGTCGATC TCCAGCCATC CCGCCACTTC GGGCATGTCC TCGCCGGTCA 
8551 GGACGATATC GCAATGCGGA TTGGCGCCCA CGCGCGCGCC ATGCACGGCC 
8601 GGGCAGCGCG CCATGCACTG TGCGCCTGAA AGCACGCGGA ATTCCAGCGC 
8651 CGTCGTCATA GATCCACCCT GCCCAGGGGC TGTACATTGA TCTCCGGCGT 
8701 CAGTTCCTGG TAGGACAGCA CCGGCAGGGC GTAGAGATCG GCTTCTATCA 

8751 TCTTGCGCGT GTAGCGCCGG ATGTCCATCG ACGTCAGCAA GACGGGACGG 

8801 CTCGCGCCGG CGGCCAGATC GCCGACACAT TGACGGATGT GCTCGACCAG 
8851 TCGGCGTGTC GTGTCCGGAT CGAGGGCGAG ATAACTGCCG GCGGCGGTCT 
8901 GCCGGATGGC GGCGCGCACG GTTTCCTCGA CCTTGGGGGC CAGCAGGTAG 
8951 GCGGGCAGGA TATTGTGGCC GCTGGTGTAC TTGTGGCTGA TATAGCGCTT 
9001 GAGTGCGATT CGGACATACT CCGTAAGCAG GACGGTATCC TTT7CCTTCT 
9051 GGCCCCATTC GACCAGCGCT TCCAGGACGG CGCGCAGGTT GCGTATCGAC 
9101 ACTTCTTCGG AAACAAGGCG CTGCAGGATT TCGGCAATCT TCTGCACCGG 
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Figure 5 (continued) 

9151 CATGACGCGC AGGCACTCCT TGACCAGATC GGGAAATCGT TCTTCCATGG 

9201 CCGAAAGCAG AAACCGGGTT TCCTGGATGC CGATGAAATC GGCTGAATAT 

9251 TTTTTCAATA CATATGCCAA GTGCCAAGTC AGGATCTGGC TGATACCCAG 

9301 GTAAGGAATA CCTGCATCGC GCAAGGCGCC GGTCAGACTG GCCGCAACCC 

9351 AGATCGTGGG CGTATCGGGC AGAAAGGCCG CGCCCGTTTC GTATGCGATC 

9401 CGCAGGGCCT GCAGGTTCTG CTCGGTGTCC CGCACCAGCA CGGCATCGTC 

9451 GCGCAACATT CCTTGCGCCA CCGGGATCTC CGACAGCACG ATGGTGTAGG 

9501 TATTGGCGGC CAGCGCTTCG GTGAAGCGCA ACTGGATGCC GGGAAACGGC 

9551 ACGCCCAGGT CGAAATAGAG CGCCCGCCGG ATCTGCAGCA GATCGTCGGT 

9601 GAGGGTGGCC GGCTCGAACC GGGGCTGCAG CCGCGCGGCT ACGTCGATGA 

9651 TCAGCGGGAC GGTGGGGGCG AATTCCGCCT GCCCATCCGC CGGCGCGCGG 

9701 GTGCGGGGCT GGCCGTCGGC AGCCATGCCG GCGAGCGCGG GCTCGGCGCC 
9751 TTCGGGCGGA CGCTGGGATG CGCGCAGCAG TACGAAACCG ATGGTGCCCA 

9801 CCGCGGCGGC CAGGGCGAAG AAGACCAGCG TGGGCATGCC GGGAATGAGG 
9851 CCCAGGCCTG CCGAGATCGC GCCGGCAATG ACCAGGGCGC GAGGCTGCGC 
9901 CAGCACTTGT GCGCCGATGT CGGTGCCTAC GTTGGAGGGG CCATCCCCGG 
9951 TCTGCACCCG CGTCACGATG ATTCCGGCGC AGATGGCGAT GAACAGCGCC 

10001 GGGATCTGCG CGATGAGCCC GTCGCCTATG GTCAGGATGG CATATGTCTG 

10051 CACGGCCTCG CCGGCGCTCA GGCCGCGCTG CAGCACGCCG ACCAGCATGG 

10101 CGCCAAGCAG GTTGACGGCA ACGATGATCA GGCCGGCGAT GGChTCGCCC 

10151 TTGACGAACT TCATCGCGCC GTCCATGGCG CCATACAGTT GGCTTTCCTT 

10201 CTCGACCGTA CGGCGTCGGC GTCGGGCTTC GTCCATGTCT ATGGTGCCCG 

10251 CGCGCAAGTC CGCGTCGATG GACATCTGCT TGCCGGGCAT GGCGTCCAGC 

10301 GAGAAGCGCG CGGCGACTTC GGCCACCCGC TCCGCGCCTT TGGTGATGAC 

10351 CACGAACTGC ACGATCGTGA GGATGAGGAA AACCACCAGG CCGACGATCA 

10401 GGTTGCCGCC CACCACGAAG TTGCCGAAGG TCTCGATGAT GTGGCCGGCA 

10451 TCGCCTTGCA GCAGGATCAG CCGCGTGGTC GCGATGGAGA TGCCCAGCCG 

10501 GAACAGCGTG GTGACCAGCA GGACCGAAGG GAACGAGGAA AACGCCAGGG 

10551 GCGAAGGCAG GTACATCGCG ACCATCAGCA GGACTGCCGA CAGCGTCATG 

10601 TTCGCACCGA TCAGCACGTC GACCAGCGTT GTGGGCAACG GCAGGATCAT 
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Figure 5 (continued) 

10651 CATGAAGACG ATCGCCACGA TGAGCACGGC CAGTACGATG TCGTTGCGGC 

10701 TGGTGGCCAG CGCCACCGCG CGTTGCAGGC GGCGAATGGA TTTCTTGCTC 

10751 GTCATGGCGG GGTCGCGCTC GCGCAAGACG CCGCCCGTAG CGCCATA7AG 

10801 TCACGCATGG CCTGGCGGGC GTCGTCGGGT CGGTTCAGCG CCTGCATGGC 

10851 CTGCGCCCGG ACCAGATGGC CGGCGGCATC GGGTGTGGCG CGCAGTGCGC 

10901 GCTTGTCCAG CGTGACCAAG GCCATGCGCG GTTCGCCCTG GTGCAGATAG 

10951 CCCAGCGCCA GGGCCAGCAG GGACTGGCTG TCGATCGCGT CCAGGGCA'C 

11001 CAGGGCCGCC AGCAGGGCAA CCGTCTTGCT CCATTGGCGC TGCAACTGG? 

11051 AGTGGTGCGC AAGCAATTGC AGAAGTTCGC GTACCTGAAG GCTGGGCGAG 

11101 GGTAGGGTAT GCGGCATCAT CCCTGGTAGA GCGCGCTGCG ATACATGGCA 

11151 GCCAGGTCGC GCAGCTTTCC GGCCTCGTTG AGTACGTGCA AGGCGCGGCT 

11201 TAAGGCTGGC GCGGTGTTGC CGGCATGCAG TTCCATGGCC CGGGACAGTT 

11251 CGTCGCGCGC CTGGGCCAGG GCGCGCTCGA ACTGCGCGGG CTGGAATAAT 

11301 GCGCCATCCG TCAGGCGAGG CCGCGCCGAC TCGTCGAGCA TCGCGCTCAG 

11351 GTCCGGGCGT ACGAGCAGGG CTTTCAGATG ATCGACCGCG CCGGTGGCGG 

11401 GCGGTTCGAG CCAGCGCTCG GGTGGCAGGG TGGGGGCGGG CTCGCAGCGG 

11451 GGACCGCGCA CGATGTGGTC GACGCCGCGC TCCAGGCCGA GGTGCATGGC 

11501 ATGCATGCCG GGTACGGCGC TGGACATGGC GTCACGCCTC CAGGCGGTCG 

11551 AGCCAGTTCG TCAGGCCCAG CACCGCCAGG CGCAACGCCG CTGCGTCGAG 

11601 CGAGCGTTCG GGCAGTCGTG TCTGCGCGAC GATCCAGTCC TCGCTGCCCT 

11651 CCGACCAGAG TGACGTCTGG ATGGATGCGG CGCTTCCGCG CTGCCCATGG 

11701 GCCCGTTTCC ATGCCGCCAG CAACACGGAG GCGGCGTCGC GCTCGACCCG 

11751 CTGGGCCAGG TGGACCAGGG CCGCGCCGGC GACGCATTCG ACGCCCAGGC 

11801 GGCGCCCGTT GGACAGCGCC AGCGACGCCG ATCCGGACGG CCCGAATGCC 

11851 AGGCCCTCGA TGCCGATATC CTGGCCGAAC TGATGCAGCG CCCTATCGGC 

11901 AGTATTCATG CGTTCTCCAT TGCTATCGCG TTGTCCAGCG CGTCCTGCGC 

11951 GGCCGCCAGG ACGGTGGCGC GCACGTCCAT GTCGGCGTAG ATCTGCGTGG 

12001 GCAGGTCTTT GAGAATCTGG CGTACGCCAC CGAGGAATGC GATGCGCTCG 

12051 GAGAGGGCAT TCGCACCGTG GCGCTCGGCC AGCTTCTCGA AGCGCGCGGG 

12101 CGCAATCCAT TTGTCCTCGC TGATTCCCAC AAGATCGCGC ATCAGGCCCT 

12151 GGGCGTCCGC ACACTCCTGC GAGCCTGCGT TGCCCAACCG TTGTTTCAGG 
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12201 
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13401 


AATCCGAAGC 
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5 (continued) 

CGTGGCGGCC ACCTCGACTT GATAGAGATC 
TGACGCCGTC CGTCGACGGT GTCGCCGC3G 
TGAATCAGCG CGCCCAGCGC GCCGTGGA7G 
CAGCACCAGG TCCAGCGTGC GCGCCAACGA 
CGCGGTACGC GTGCTGGAAG CCGGCCAGCT 
GCGCCGGCCG TGGGCAGGGT GTTGATGCCG 
GGCGAGCTCC AGGTCGGCCA ATGCATCGCG 
GCGCGGCGTC CTCGTGCTCG CCGCGCTGCA 
TATTGCTGCG TGACACCGGG AAACGCTTGC 
GCCCCGGCCG CGCAGCAGCT CGGCGGTCAG 
CGTCGGGGTC GTGGGTGTGG GAAAACAGTT 
AGCCAGAGCA TCGGACGTTC GGCCGTGACC 
TTCCTCGGCA GCCTGCGCCA TGTGCAGGOT 
CCAGCGATAT GCCGGTGGGC GCCGGTGCGA 
CCGGAGGAGG TGTTGGCCGA GGCGTCGTGG 
GAAGGGATTG GGGGCGGCAT CGATACGAGT 
GGAACCATTT GCCTACTGGT GCAGTGAGTG 
CCCGGAAACG GGCGCGATAT TGGGCAATTC 
GGCGCAGGGT TACTCAGCAT GCGTCTTTCA 
GCATTGATCT CGGAGTTTCA CTCACGTCGC 
ATCGACCTCA AGAGCATGGA TATCCAGACT 
TCGTCGCGCC GAACTCCTCA CGGCTCAAAT 
TGCAGAAGGC CAATGAACGC ATGGCGCAGC 
CTGTCCCGGG CCAAGGCCGA GTTTCCGCCC 
CATCCCGGGC TGGGACAGCC AGAAGATCAG 
ATGATGCGCT GCGTGCCGCC GGCCTGACGG 
GGCCGGGTGA CCGGCCCCGA CGGCCGGGGT 
GGGCGTCATG GCCGGTTCCA CGACCTATAA 
CCACCGTAAA GGGGATGCTG GATACGGCGT 
ATGATCAGGC TGCAGGCCGC CAGCAACAAG 
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Figure 5 (continued) 

13701 CGCAACGAGG CTTTCGAGGT CATGACCAAC ACCGAGAAGC GGCGCAGCGA 

13751 CTTGAACAGC TCCATCACCA GCAACATGCG CTAAGCGCTG CACAAGGAGT 

13801 ATTCCATGCA GGAGCAAGGC ATCCAATCCA TCATGCGCGC CGCGGAAGAG 

13851 CTGGTCGAGC AGACCCGCCA GGCGTTGTAC AGCGTCGACG AGA7CTACGC 

13901 CCACGTTGGC GTCGACCCCG CTCGCCTGCG CAATCTGGCG GTCGAGCAGG 
13951 CCAGGATAGA GGCCGAGGCC CAGGCGGCGT TCCGTGATGA CC7CGCGGAC 

14001 ATCGAGCGCG AGGCGGCGCG CGTCAAGGCG GCCTGCACCG ATGCGCCGCA 
14051 GGCCCGCAGG GTGCTTCACA ACCACGTCTG AGCGCGGAGG CC77CCATGC 
14101 CAAAGTCAGC CGACCAGGGC GGCTCCCCGG CGTCAGCTTC GCATGAGGCG 
14151 TTGCGCCATA TTCTCGACGC AGGCGCTTCG ATGGGGGGCT TGCAGGGGTT 
14201 GGACGAGGCG CAGCAGCAGG CGTTGTACGC GATCGGTCAT GGCGCCTACG 
14251 AACAGGGGCG CTATGCCGAC GCGTTGAAAA TGTTCTGCCT GC7GGTCGCG 
14301 TGCGATCCGC TGGAAGCCCG TTATCTGCTG GCCCTGGGCG CCGCGGCCCA 
14351 GGAGCTGGGG CTGTACGAGC ATGCCTTGCA GCAATACGCG GCCGCGGCGG 
14401 CTTTGCAGTT GGACTCCCCC AGGCCCCTGT TGCATGGCGC CGAGTGCCTG 
14451 TATGCGTTGG GTCGTCGCCG CGACGCCCTG GATACGCTCG ACATGGTGCT 
14501 TGAGTTGTGC GGCTCGCCGG AGCGTGCGGC CCTGCGCGAA CGGGCCGAGT 
14551 TGCTGCGCAG GAGCTATGCA CGTGCCGACT GAAACGGCGC CATGTCCGCC 
14601 GTCAAGATTT CAATTCGAGG AGGTTCGATA TGTCTGTTTC TCCGACTTCG 
14651 CCCGGCTCTT TCGGGGCCGG CCCTGTCTTT GACTCCGAAT TGCAGGCCCC 
14701 GGCCCCGTCG GCGCAGCGTC GCGGCGGTGC GGCGCCTGTG CCGCCGCCCG 
14751 TCGATCGGCG CGGCGTCGAG CCGGGAGATC CCACGCTGGG CATGCTGCCC 
14801 GCGCCAGATT TGCTCGCGGG GGGCGCCGTC AGCCGCACCC GCGCGGCGCT 
14851 CGACGATCTG GACGCCGCAC GGCTCGGTGA AGACATCTAC GCCTTGATGG 
14901 CGGTGTTGCA ACAGGCCAGT CAGCAGATGC GGGACGCCGC CCGTATCGCT 
14951 CGTGATGCCG AGGCTACGCG GCAAACGCAG GCTCTCGGCG ATGCGGCCAG 
15001 CCAGATGCGC CAGGCGGCGA GCGAGCGCAT GGCCGGAGCG ATCGTGGCGG 
15051 GCGCCATGCA GATAGCGGGT GGTTTCGTGC AGCTGGGGGC GGGCCTGGCA 
15101 GCGGGTTTGC AGGCCATGGG TGGCGCTGCT GCGCAAGCCA AGGGCGCCGC 
15151 ATTCTCCGAG CAGGCCTCGA CAAGCCGCAA GGTGGCGGCC GGCTTGCACG 
15201 ATGCCCCCGA GCTGCAGGCA ACGGTGCAGG CCCGCGCAAC CCAGCTCGAA 
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Figure 5 (continued) 



15251 GCGCAAGCGG CCTCGTTTGG TGCGGACGCG GCTCGTTCGT CGGCAAAG7C 

15301 GCAGCGCGTA TCGAGCGTTG CCCAGGCCGG CGCCGCAGCG GCCGGCGGTA 

15351 TCGGCGGCCT GACCAGCGCC GCCCAGGAAC GCCGCGCCGC CGAGCACGAG 

15401 GCCAGGCGCG CGGAGCTGGA CGTCGAAGCG AAGGTGCATG AAACGGCC7C 

15451 GCGGCGCGCC GACGAAGCCA TGCAGCAGAT GCTCGACATC ATCCGCGGCA 

15501 TCAGGGAAAA GCTGGCCGGG ATGGAGCAGT CCCGCAGCGA GACCGCCCGT 

15551 AGCGTGGCCC GCAATATGTG AGTGTCCGGC TCCAACCTTC AA7CTTGAGG 

15601 ATGACCGTCA TGAGTACGAC CATATCCACA GCCCCGAGCG GCGCCGCGCT 

15651 TGCGCCGTCT CGCATAGATA TGCGGGCGCC GGAGCCCGGG AGTGCCGGCG 

15701 AAGGCGCCGG TATCCTGGCG CCGGTGACGA CGCTGGCTCT GGCGGCGGGC 

15751 CGGCCGGCTT TGCCAGCGTC ACCGTCGCTG CGCACCGCGC CCG7CCTGGA 

15801 TCCGCCAGTG CGCGATCTCA GCCCCGCCGA CTTGGCCGAC CTGCTGCGCG 

15851 TCTTGCGATC CAGGGCGGTG GACGGGCAGT TGGCCACGGC GCGCGAGAAC 

15901 CTGCAGGATG CGCAAGTCAA GGCGAAGCAG AACACCCAGG CCCAGCTCGA 

15951 CAAGCTGGAC GCATGGTTTC GGAAGGCTGA GGACGCCGAG AGCAAGGGCT 

16001 GGCTGAGCAA GGTGTTCGGC TGGATCGGGA AGGTGCTGGC GGTCGTGGCA 

16051 TCGGCCCTGG CTGTGGGCTT TGCTGCCGTC GCCAGCGTGG TCACCGGCGC 

16101 GGCGGCCACG CCCATGCTGG TGCTCAGCGG CATGGCATTG GTCAGCGCCG 

16151 TGACATCGCT GGCCGACCAG ATATCGCGAG AGGCGGGAGG GCCGCCTATC 

16201 AGCCTGGGCG GGTTTCTCTC CGGGCTGGCC GGACGTCTGC TGACAGCG7T 

16251 GGGGGTGGAT CAGTCGCAGG CCGACCAAAT TGCCAAGATC GTCGCCGGCC 

16301 TGGCCGTGCC CGCCGTCTTG CTGATCGAAC CCCAGATGCT GGGCGAAA7G 

16351 GCCGAAGGCG TGGCCAGGCT GGCGGGCGCC GGCGATGCCA CCGCGGGATA 

16401 CATAGCCATG GCGATGTCCA TCGTGGCGGC GATCGCGGTC GCCGCGA7CA 

16451 KTGCCGCCGG TACGGCCGGC GCGGGCAGCG CCTCGGCGAT CAGGGGTGCC 

16501 TGGGATCGGG CCGCCGCGGT AGCCACCCAG GTCCTTCAGG GGGGTACGGC 

16551 AGTGGCGCAA GGCGGCGTCG GCGTGTCGAT GGCAGTCGAT CGCAAACAGG 

16601 CCGATCTCCT GGTCGCCGAC AAGGCGGATC TGGCGGCGAG CCTGACAAAA 

16651 CTGCGGGCGG CCATGGAGCG TGAGGCGGAC GATATCAAGA AGATCCTGGC 

16701 TCAATTCGAC GCGGCCTATC ACATGATCGC GCAGATGATC ACC~ACAT" 




WO 00/37493 



Figure 5 (continued) 

16751 CGAGCACGCA CAGCCAGGTC AGCGCCAACC TCGGACGGCG CCAGGCGG7G 

16801 TAGCGCCGGG CGCTCAAGGA ATTTTCATGA CTGTTCATGA CGACGCGC-CC 

16851 GCGGCGCTGC GCGCCCGGCT GGATGCGTTG CCGGGCAGCC GGCGCCTG-Z 

16901 AGCCGAGCAA TTGGAAGTGA TTTACGCGAT GGCGTATGCG CACGTCGCCA 

16951 GGTGCGAGTA CGGCAAGGCG CTGCCCATTT TCGCCTTCCT CGCGCPlGZI.C 

17001 GGCCCCACGC GCAAGCACTA CTGGGCCGGC CTGGCGCTAT GCC7GCAGAA 

17051 GACCGACCGT CCCGACGAGG CGCGCAATAT CTATGCGTTG ATCCTCACGC 

17101 TCTATCCAGA TTCCGCGGAT GCCGTGTTGC GCACGGCCGA GTGCGAGC7G 

17151 GCGTTGGGTG AGAACGAACG GGCGCAGGCG GCCCTGTTCG GCGCAATCGC 

17201 CATCGATGCA GAAAGTGGGC AGCCAGGTCC GGTCTCGCAC CGTGCGCGCG 

17251 CTTTGCTCGA TCTTATTTCA GTTTCACATC CGGAGTAACT CCATGCAC7C 

17301 AGACTCAGGT TCAGATTCAG GCTCAGACTC AGGCTCAGGC TCACCCATGG 

17351 TCTCGTCGAT ACATCCATCG GAACCGATAC AGCCGATGGA GCATGTGCTC 

17401 GAGGAGGCCG ACGCCCGCCT GCTTACCGAA GTGGGTTTTC TGGCGGCGGC 

17451 CGTCAGCGAT CTGACGCGCG CGGACGCCAT TTTCAATGCA TTGCAACGTG 

17501 TACGGCCGGG CCGGACGCAT CCCTGCATCG GCCTGGCGGT CGCCCGCATG 

17551 AACGCCGGGC TGCCCGACGA AGCCGCCGAG ATCCTGGCGA ATTTCCAGCC 

17601 GGCACAGCCG GAGGACCGCT CGGAACTGGA CGCCTGGTGC GGGTTCGCTC 

17651 TGTTGCTGGC TGGCCGCTCG GACGAGGCGC GCCGCATGCT GCAGCGAGCC 

17701 ATCGATGCGG GTGGCGAGGC GGCAAGGCTG GCGCAGGTCG TGT7GGACAG 

17751 CGGACCCGCC ATGATGCGGC CCGCGCCGTT GCAGTCCGAG CCATTACCTG 

17801 GAGCTCCTGG ATGAATTTGG ATCTGACGGC GATCAACGCC GTGCAGGAAC 
17851 GGCTGCTCGC TGGATCATTC GACATGCCGC GATCTCCCGC GATGGCGGAT 
17901 CAGGCGCGCT TTGAATTGGC GCTGGGCGAG ATGCCCGGCG CATCGGCCCC 
17951 GAACGGGGCG ATCGCCCTGG CGCCGGTCGC GCTCGACGAG CCGCTGGGCC 
18001 GTCGCATTCT TGGACAGTTG CGCGGCGGCC TGGCCGATGT GGCAGGAAAA 
18051 TGGCGGGCGG TGCAGACGGG CTTGGCCGAG GTGAGCCAGG CGCCTACCGT 
18101 GGTGGGTATG CTCGATCTGC AGGCCAGGTT GCTACAGGCA TCCGTGGAGT 
18151 ACGAGTTGGT GGGCAAGGCA ATAGGGCGCG CCACCCAAAA CGTCGATACG 

18201 CTGGCGAGAA TGTCATGAAC GCCATCGGGG CGATCCAACG GTATCGGCGC 
18251 GGCGCGGGAT GGGCGGCCCT GGTGCTCGCC CTGGCGCTGC TGGCCGGCTG 
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Figure 5 (continued) 
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CGGTGCCCGC 
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GGATCGGTCC 
18901 GCTGTTGCTG CTGTTGTTGC CGGTGCTGTG CCTGATAGTG GCGGGGGCCG 
18951 CGCTCTACGT CTGGCGCACG CGCTGGTCCC GCGGCGAAGG GCGGGGCGGC 
19001 GCTGGCGCCG GCGCCACGGA AGGAGCCGGG CATGACTGAG GCGAGCGTGC 


19051 


TGCTTTCCGA 


GCGGCTCATG 


ATATTCAATC 


TCCTGCCCAG 
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19101 
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GCCACGACGA 


GATGTTTCCA 


GCCGATTGGG 


TGCGCGCGTT 



19151 GTGCAATGCC GACGCGGCGT TGGCCAACGC GTGGCATCGC CAT7GGTCGC 

19201 GCTGGATCTT GTGCGAATTG GGCCTGCTGA ACCAGCCGGT CCTGAGCC7C 

19251 GATCCGCCGC AGTTGAAGGT CGCGCTATTG TCCACGGACG CCT7GCGGAC 

19301 CTGCGCCGCC CATGCGGGAG CGCTGCTGTG CGCGCCGCGC CTGCGACGCG 

19351 CGATAGACGG CGCCGAGGTC CGTACCTTGC ATGCCGCGCT CGGGCGCGAT 

19401 GTGATGAATT TCGCCGTGTC TTCCGCGGCG CGGGCCCTGC ATGACGGGCT 

19451 CGCCGCCAGT TCGGACTGGA CCCTGGCCGC CACGGTCCAG GCGGCGCAGA 

19501 AACTGGGCTG GGCCCTGCTG CGCGACGCCG TGCAGGGCGC CGCCGACGAG 

19551 ATAGCGCTGC GTTGCGCGCT GAAGTTGCCG CGCGACCTTG ATCCCGCGCC 

19601 CGTCCTGCCG CCCGAGGCGG CGCTTGCGCT GGTGCTGTCC ATGCTCGAAA 

19651 TCCTGGATGC AGAATGGCTT TCCTCGTTCC CCGCCCPJKGC CTGATCCAGG 

19701 CGGTACGGCC CGGCCGTGCG GATCCCGCGA CCGACGTCTT GCGCGCTGAA 

19751 GACTACGCCG AGCTGCTCAG CGCCGCGCAG ATCGTTGCCC AGGCACAC7G 
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Figure 5 (continued) 

19801 GCGGGCCGGC GAAATCGTGG CCGAGGCGCG AGAGGAGTTC GAGCGCGAGC 

19851 GCAGGCGAGG CTATGAGGAG GGGCGCCGCG AAGCGCTTAC GGA7CAGGCG 

19901 GAGAAGATGA TAGAAACCGT AAGCCGCACG ATCGACTACT TCGCGGGTAT 

19951 CGAGAACGAG ATGATCGAAC TGGTCATGAG TGCGGTCCGC AAGATCGTCG 

20001 ACGGTTACGA CGACCGCGAG CGCACCGTGA TCGCCGTGCG CAACGCAT7G 

20051 GCGGTCGTGC GCAATCAGCG CCAGATGACC TTGCGCCTGC ACCCAGACGA 

20101 GGTGGATGTG CTCCGGGAAG GCATGAACCA GCTTCTGGCG GCC7ATCCGG 

20151 GCGTGGGCTA CCTGGACCTG CTGCCCGACG CCAGGCTGGC GCCGGGAGCC 

20201 TGCATACTGG AGAGCGAGAT AGGCATGGTC GAGGCCAGCC TCGAGGACCA 

20251 GCTGTGCGCC TTGCGGGCGG CCTTCGAACG TACATTCGGC CGGCGCGGAT 

20301 AGGGGCATGC GTCAGTACCA CTACATCACG GAGATGATGC GGGTGGCCCT 

20351 GCAGGATCTG TCCACGCTGC GGATAAAGGG CCGGGTGGTG CAAGTGGTGG 

20401 GAACGATCAT CAAGGCCGTC GTTCCGATGG TCAAGATCGG CGAAGTGTGC 

20451 CTGCTGCGCA ATCCCGGCGA GGACTTCGAG ATGCACGGCG AAGTGGTGGG 

20501 CTTTGTCCGC GACGCCGCCT TGCTCACGCC TATCGGCGAC ATGTACGGGA 

20551 TTTCCTCGGC GACCGAGGTG ATACCGACCG GACGCACGCA TATGGTCCCC 

20601 GTCGGTCCGG GCTTGCTGGG ACGCGTGCTG GACGGGCTGG GACGTCCGCT 

20651 GGACGCCGCC GAGTCAGGGC CGCTGCATGC CCACAAGTTC TATCCGGTCT 

20701 TCGCCGATGC GCCAGACCCG CTGACGCGTC GCATCATCCA TGCTCCGCTG 

20751 GAGCTGGGGG TGCGCGTACT GGACGGTTTG CTTACATGCG GGGAAGGCCA 

20801 GCGTCTGGGA ATTTTCGCAG CCGCCGGCGG CGGCAAGTCG ACCCTGCTGG 

20851 GCATGCTGGT CAAGGGCGCC GCGGTCGACG TGACGGTGGT GGCGCTGATC 

20901 GGCGAGCGTG GGCGGGAAGT TCGCGAGTTC CTTGAGCACG AACTCGGTCC 

20951 GGAGGGCAGA CGCAAGAGCG TGATCGTCTG CGCGACCAGC GACAAGTCCT 

21001 CGATGGAGCG TGCCAAGGCG GCGTACGTCG CAACCGCCAT CGCCGAATAC 

21051 TTCCGCGATC AAGGGCAGCG TGTACTTTTT CTGATGGACT CGGTCACCCG 

21101 CTTTGCGCGA GCCCAGCGTG AAATCGGCTT GGCGGCAGGC GAGCCGCCGA 

21151 CGCGGCGCGG CTATCCACCG TCGGTGTTCG CCACCTTGCC CAAACTGATG 

21201 GAGCGCGCCG GCATGAACCA GACGGGTTCG ATCACGGCGC TGTATACGGT 

21251 GCTGGTCGAG GGGGACGACA TGAACGAACC GGTGGCCGAC GAGACGCGTT 

21301 CGATACTGGA CGGCCACATC GTGCTCTCGC GCAAGCTGGG AGCGGCGAAT 
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Figure 5 (continued) 



21351 CACTATCCTG CCGTCGACGT GCTGGCCTCG GCCAGCCGGG TCA7GAATGC 
21401 CGTGGTGTCG CCGCGTCACA AGTACCTGGC CGGACGTATG CGCGAACTGA 
21451 TGGCCAAGTA CCAGGATGTC GAGCTGTTGG TGAAAATCGG CGAGTACAAG 
21501 CAGGGCGCCG ATGCGTCGAC CGATGAGGCG ATACAGAAGA TCGGACAGAT 
21551 CAATGCGTTT CTCAGACAAC TAACCGACGA ACGCGAAGCA TTCGAGGATA 
21601 CCGTACTGCG CATGGCTGAA ATCATCGGAC CCGAATCCTA ATGGACCTGG 
21651 AAAGCCTGCT TGCCATCAAG CATTTTCGCG CCGACCAAGC CCAGCTTGCG 
21701 CTGAAACGCC AACAGCAGGC CTGCGCGGTT GCTGCCGCGG CGCAGCGTCA 
21751 GGCGCAAGGC CGCCTCGACG ATTGTCGCCT GTGGGCCGGA CAGCTCGAAA 
21801 ACCGTCTATA TGCCGAGCTG TGCCGGCGCA TCGTCAAGAC ACGCGACATC 
21851 GACGAGGTGC TGCAACGAGT GGGCCACGCC CGCGACCGCC AGGCCAGCCT 
21901 GGCGCTGCAG CTCGACGACG CCGTGCGCCG TCACGAACAT GAAATCCAGC 
21951 TGCTCGCGCA GCAGCGCGAG CAGCACCGGG AGTGCTTCCA GGCGCAGCAA 
22001 CGGATCGCCG AGTTGGTGCG CCTGCAGCAG GTCGAGGCGG CGGCCTTGCG 
22051 CGAGAGCCAG GAAGATCGCG AAATTCAGGA AGCCATCGAA TTGTCGGCGC 
22101 GTGGGCGCGA CGATGCATCG CGAGCCGGCG ACGGCCTGGC GCGGCTATGA 
22151 ACCAGCCAGA CGGGCTGGGT TCGCCCATGG CCGGCGGCGG GCAGCGCATG 
22201 GGCGTGGCGC GCACGCCGTA TGCGCGTCAG CCGGATCGGG ATGCGCAGCG 
22251 TGCCTTCGAG CGGGAAATGG AACAGGAGAA AGCGAAGGAA GAACTGCCCG 
22301 GGCCGCAACG CCTGGCGCCG GGTCCGGCCT GCGTCGGCTG GCTGGCGTCG 
22351 ATGGAACCTG CCGCCGGCCG TCCACCGGCC AGTCTGGCCC AGGCGCTGGC 
22401 AAGCGTGGCT GCGGGGCTGG CGGTAGGCGA CGTGCTGGAG GGGTATCGCG 
22451 AAGCCCGTAT CGTTGTGGAC GATACGCTGC TACCCGACAC CACCTTGTCG 
22501 GTACGGGAGG ACGGCGGCTG GATCGTGGTG GCTTTCGCAT GCCGACAACG 
22551 GGACGCTTGC GAGCGCCTGC ACGCGTGCGC CGACCGGTTG GCCATGGAGC 
22601 TCGCGCTGGA GCTGGCGCGC GACGTCGAGG TTGCGGTGGC ATGCGACGGC 
22651 GAGCCGCACG AGCGGGTGGC GCGCGCGCAG CGGCCGTGGC GATGAATCGA 
22701 GTGGCCGGCG GGGCGGCGGC GCAGGCCGCT GGCATGGTGG ATCTCGCGGT 
22751 TCCGCGGTTG AGCGCCGGCG AGGCCCATGC CCTGTCGAGG ATTGCATGCC 
22801 ATGGCGCGCG ATTCGACGTT CGGCTTGGCG AGCCGGCCGT GCGCTGGCAC 
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Figure 5 (continued) 

22851 TGCGCCCTGA CGCCTTGCGT GCACGGCGAC CTTGCCGATG GCGAGATGGA 

22901 AAGCCTGCAA CTGCAATGGG CCGGGACGTA CATCGGCCTG ACGGTTCCGC 

22951 GCGCGGCCGC GGCGGGATGG CTGGCGGCGC GCCTGCCCCG GTTTTCCGGC 

23001 GTGGAGTTGC CGGAACCCAT TGCGGCGGCG GCCCTGGAGG CAATGCTGGA 

23051 GGAGGTCTGT CGAGGCGTGG CCGGACTCGA CCAGCAAGGC CCGGTCCGGG 

23101 TGGCGCGGCA AGGCGGGACG CCACCGGTCC AGCCGCATCG CTGGACCCTG 

23151 ACGGTACGGG CGCCTGACGG TGGCGTCTGG CGCGCGGTAC TGGCGTGCGA 

23201 CGCATGGGCC TTGCAAGCGG TCGCGGCGGC GCTGGATTCC GTTGCGCCTG 

23251 CCGATGGTCG GGTCAATCCG GAGCGCGTGC CGGTCAGGTT GCGTGCCGAT 

23301 GTCGGCGCGG CGTCCGTGAC CGCAGGCCAG CTGCGGACGC TGCGAGCGGG 

23351 CGACGTCGTG TTGCTCGCGC AGTACCGGGT GAGCGATGCC GCAGAACTAT 

23401 GGTTGTCGGC CGGACCCAGC GCGATCCGGG TACGGGCCGA GCATGCGTCT 

23451 TTTCGTGTAA CTCAAGGTTG GACTCCCATC ATGACGGAAC CCGCGACACC 

23501 TGACCCTGGC GAAACCCCGG CACAGGCCGA CGCGACGCTC GATACCGATC 

23551 AGATACCCGT GCGCCTGACG TTCGACCTGG GCGAGCGCGA GTTCACGCTT 

23601 GCGCAGCTGC GCAGCCTGCA TCCGGGCTGC ACGTTCGACC TCGAGCGGCC 

23651 CATCGCCGAC GGGCCGGTCA TGGTGCGGGC CAATGGCCTG TTGCTGGGCA 

23701 GCGGCCGGCT GGTCGACATC GACGGCCGCA TCGGCGTGGT ATTGCAGTCG 

23751 GTCAGGCCTG GACTCGCATG AGCGATACCG ACCCCTTCAG CCTGGCCCTG 

23801 TTTCTGGCGC TGCTGGCGCT GGTACCGCTC ATCGTCGTCA TGACCACGTC 

23851 GTTCCTGAAG ATCGCCGTCG TGCTTGCCTT GGTGCGCAAC GCCCTGGGAG 

23901 TGCAACAGGT ACCGCCCAAC ATGGCCCTGT ACGGGCTGGC GCTTATTCTT 

23951 TCCGCGTATG TGATGGCGCC GGTCGTTCAC AGGATAGGCA CCGAGGTCCA 

24001 GGCCTTGACC GCGCAAGCCG GGGAGTCCGG CACCGCCGCG CCGATGGCGC 

24051 TGGACGCCGT GCTTGGCGTG GCCGAGCGAG GCGTGGGGCC GCTGCGGGCC 

24101 TTCATGTTGC GCAACAGCCA GCCGGCCCAG CGTGATTTCT TCCTGCGCAC 

24151 AGCGCGTCAT CTCTGGGGCG AGGAGGCATC GCGGGACCTG TCGGAAGACA 

24201 ACCTGCTGGT ATTGACGCCC GCATTTCTGG TTTCGGAGCT GACCGCCGCA 

24251 TTCCAGCTTG GCTTTCTGCT GTACCTGCCG TTCATCATCA TCGACCTCAT 

24301 CGTATCGAAC ATTCTTCTTG CCATGGGAAT GATGATGGTT TCTCCCGTGA 

24351 CGATCTCCAT GCCGTTGAAG CTGTTCCTGT TCGTCATGGT GGACGGCTGG 
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Figure 5 (continued) 

24401 ACGCGCCTGA TCCAGGGCCT GGTGCTTTCC TATCGGTGAC CAGCATGCAA 
24451 ACCCAAGACC TGGTTTCGTT CATGACACAG GCGTTGTACC TGGTGCTCTG 
24501 GCTGTCGCTG CCGCCCATCG CCGTGGTGGC GATCGTGGGA ACGCTGTTTT 
24551 CCCTGTTGCA GGCCTTGACG CAGGTGCAGG AGCAGACCCT GTCCTTCGCC 
24601 GTGAAGCTGA TAGCCGTGTT CGCCACGCTG ATGCTGGCGG CCCGGTGGAT 
24651 AAGCGCGGAA ATCTATAACT TCACGATTGC GGTGTTCGAT GCCTTTCATC 
24701 GGATCCACTG AGCGGCCAAT CGATGCACAC GGAGTTCAAT TTCGTCGAGG 
24751 CGAAGGTTTT CCTGGGAACG CTGGCCATGA CGCAACCGCG GATACTCACG 
24801 GCCATGCTCT TTCTGCCGAT GTTCAACCGT CAGTTTCTGC CTGGTCCGCT 
24851 GCGTTACGCC GTCGGCGCCT GTCTCGGGCT GATCGTGGTT CCCCAGCTGG 
24901 CGCCGCAGTA TGCCGCGCTG GATATCGACT GGCCCCGGCT GCTGGCGCTG 

24951 CTGGCCAAGG AGGCGATGGT GGGCATGTTC CTGGGTTGGC TGGCTGCCTT 
25001 GCCATTCTGG ATCTTCGAGG CCATCGGCTT CGTCATAGAC AACCAACGGG 
25051 GCGCCAGCCT GGGCGCTATC CTCAACCCCG CCACGGGCAA CGATTCGTCG 
25101 CCCATGGGCA TTCTCTTCAA TCTGGGATTC ATGGTGTTCT TCCTGACGGC 
25151 GGGCGGATTC GGGTTGTTCG CCACGATGCT GTATGACAGC TTCGGGTTGT 
25201 GGAACATCTG GGCGTGGTGG CCGTCCATGC CCGCACAGGG CGCCGTGCGG 
25251 ATGCTGGACC AGTTCAGTGG CTTTGCCGCG CGTGTCCTGC TGCTGGCCTC 
25301 GCCGGCCATC GTGGCCATGT TCCTGGCCGA GCTGGGCCTG GCCCTGATCA 
25351 GCCGCTTCGC GCCTCAACTG CAGGTGTTCT TCCTGGCTCT GCCGGTAAAG 
25401 AGCGCGCTGG TGCTGTTCGT GCTGGTGCTG TACATGGCAA CGTTGTTCCA 
25451 GTATGCAGGC GAAATCCTGG GTTCTGTGGG CCGGATCGTG CCGTTCCTGC 

25501 ATTCAGCGTG GCCCGGCCCA TGAGCGGCGA GAAAACCGAG CGGCCCACCC 
25551 CGAAGCGCCT GCGCGATTCC CGCGAGAAAG GCGAGGTCGC ACACAGCCGG 
25601 GACTTTACCC AGACGGCGCT GATATGCGCC TTGTTCGGGC ACTTTCTGAT 
25651 CAATGCCCCG TCCATTCTCG CGTCGCTGCG AGCGCTGATA CTGGCGCCGG 
25701 CGGCCTTTGC CGACCAGGGG TTCGCCGTCG CATTGGGGCC CGTGCTGACG 
25751 GAAATCCTCG ATCAGGCCGT CCGCGTGCTC GCTCCGCTGA TTCTCATCGT 
25801 GCTTGGGGTG GGGATGTTCG CCGAATTCCT GCAGGTAGGC GTCGTGCTGG 
25851 CGTTTCGAAA GCTCAAGCCT TCGGCGGAGA AACTGAATCC CGCCGGCAAT 
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Figure 5 (continued) 

25901 TTGAAGAATA TCTTCTCGGC GCGCAACCTG ATGGAGTTCA TCAAGTCGGT 

25951 ATGCAAGATC CTGTTTCTGG CGGTGTTGGT CACGTTGGTG ATACGGGATT 
26001 CCTTGCAGCC GCTGATGGCC GTTCCCCATA GCGGGCTGGA CGGGTTGCGA 
26051 ACGGGCGTAG GCCGCATTCT GCAGGTCATG GTCTGGAACA TCGGACTGGC 
26101 GTACGGGGCG ATTTCGCTGG CGGACCTGGC CTGGCAGCGT TACCAGTATC 
26151 GCAAAGGCTT GCGGATGAGC AAGGACGAAG TGAAGCAGGA GTACAAGGAG 
26201 ATGGAAGGCG ATCCCCATAT CAAGCAGCAA CGCAAGCACC TGCACCAGGA 
26251 GCTGATCATG CATGGCGCGG CGGCCCAGGT TCGCCGGGCG ACGGTGCTGG 
26301 TGACCAATCC GACACACCTG GCCGTGGCCC TGTACTACGC GGCGGGCGAG 

26351 ACGCCCTTGC CGCGCGTGCT GGCCATGGGG CAGGGAGCCG TGGCCGCTCT 
26401 CATGGTCGAG GCCGCGCGCG ATGCCGGCGT GCCGGTCATG CAGAACGTCG 
26451 CGCTGGCCCG CGCCTTGCAC GACCAGGCGG AGGTGGACCA ATACATTCCC 
26501 GGCGAGTTGG TGGAGCCGGT GGCCGCGGTG TTGCGGGCGG TGCGCCAGGC 
26551 ACTCAAGGAG CAGACATGAC AGCAACCATT CATCCCGATA TTGCCGATTA 
26601 TGCGCGACGC CATGGCCTCG AACCCTCGGT CGACGCCGAT GGCGGGCTTG 
26651 CCGTCCGGAT CGACGGACGG CATCGCGTCA GGTTGATCCC CGCCGAAGAC 
26701 GGCATGCTGG TGTTGCGGGC GCGGCTGGCC GAGCTGCCCG ATGGGTGGCA 
26751 GGCGCGCGCG GCGCAGTTGC GCCGGGCGGG CCTGCTGGCC AGCGCCATGG 
26801 CCCCTGCGAC CGATGCGTAC TGCGGCATAG ACCAGGGCGA AACCGCGTTG 
26851 TATCTGCACC AGCGCGTCGC ACCGGCCGGC AGTGCGCTGG CGGTGGACGA 
26901 GGCGGTGGGC GAGTTCGTCA ATGCCTTGGC CACTTGGAAA AGGGCGATGG 
26951 CGCAATGGCA ATAGGTCGGC TTGGGTATCT TGTCCGCGGC GCATGGGCCG 
27001 GGGGTGTCAT GCTGTTGGCG GCCGGTAGCG CCTGGGCGGC GCCGAACTGG 
27051 CCTTTGGCGC CGTATAGCTA CTACGCGCAG CAGCAGAGCC TGTCCGATGT 
27101 GCTGCGCGAG TTCGCCGCAG GCTTCAGCCT GGCGTTGCAA CAGGGCAAAG 
27151 GGGTGCAAGG CGTGGTCAAT GGGCGTTTCA ATGCGCGCAC ACCCACGGAG 
27201 TTCATCGAGC GTCTCAGCGG CATCTATGGG TTCAACTGGT TCGTGCATGC 
27251 CGGCACGCTG TATGTCAGCC GCACCAGCGA CGTGGTTACC CGCGCGGTGG 
27301 ATGCAGCCGG CGCTTCGCCG TCGGCGTTGC GCCAGGCCTT GCTGCAACTG 
27351 GGCATCCTGG ACGAACGCTT CGGATGGGGA GAGCTGCCGG CGCAAGGCGT 

27401 GGCCATGGTG TCAGGGCCGC CGGCCTATGT CGCGCTGGTC GAGCAGGCGG 
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Figure 5 (continued) 

27451 TAGCGGCGTT GCCCAAGGGG GCCGGCAATC AGCAGGTGGC GG7GTTTCGC 

27501 CTCAAGCATG CTTCCGTGAG CGAGCGGGTG ATCCGTTATC GAGACCAGCA 

27551 GGTAGTTACG CCGGGGATGG CCACCATGCT GCGCCAATTG ATCCTGGGGG 

27601 CGGGGCCGGG CAACGACGCG GCGCTGGCCG CGGTGGCGGC GCGGCTGCGG 

27651 GAAAATCCGC CGGTGTTCGG CGATGCGGCA GCTGACGGGA ACGCGCCGCT 

27701 CGCTGGCGCA GCCCAGGCAG CCGGCCGGCG CCTGAGCGAG CCCAGCGTGC 

27751 AGGCCGACAC GCGCCTCAAT GCCTTGATCG TGCAGGATAT TCCCGAACGG 

27801 ATGCCAATCT ACCGTGCCCT GATCGAGCAG TTGGATGTGC CCAGCACCCT 

27851 GATCGAAATA GAGGCCATGA TCGTGGACGT CAATACCGAT CTGGTCAACG 

27901 AGCTGGGTGT CACCTGGGGG GCGCAGATCG GAACCACCAG CCTGGGCTAT 

27951 GGCGATCTGG GGCTGCGTCC CGGCAACGGC CTGCCCGTGG ACGGCGCGGC 

28001 GGCCGACCTG GCGCCCGGAA CCTTGGGGAT CAGTGTCAGT ACCCGGCTGG 

28051 CGGCGCGCTT GCGTGCGTTG GAGTCGGACG GGCAGGCCAA TATCCTGTCT 

28101 CAGCCGTCCA TCCTGACCGC CGACAACCTC GGCGCCATGA TAGACCTGTC 

28151 GGATACCTTC TACATTCGCA CCCTGGGCGA GCGCGTAGCG ACAGTCACGC 

28201 CTGTCACGGT GGGTACGTCG TTGCGTGTGA CGCCGCGCTA TATCGCCGCC 

28251 AAGGGAGGAC GCCAGGTGGA ATTGGCGATC GATATCGAGG ACGGACGGGT 

28301 CTTGCAGGAG TATCCCATCG ATGGTCTGCC CCGGGTTCGG AAAAGCAGCA 

28351 TCAGCACGCT GGCGGTGGTG GGGGACGAGC AGACGCTGCT GATCGGCGGC 

28401 TACAACAATC GCCGTGACGA AGAGCAGGTC GAGAAAGTGC CGCTGCTGGG 

28451 AGATATCCCC GGCCTGGGGT TCTTGTTCTC GAGCAAGTCC CGGGCGGTAC 

28501 AGCGCCGCGA GCGGCTGTTC CTGATCCGGC CGCGTGTCGT GGCTATCGAG 

28551 GGCAAGCCGG TCTTCAGCCC CGTTGCGGGC ACGTCGCAGG TGTTCATGAG 

28601 CACGGGTTGG GGCGGGCATG GCAGCAGCCT GAGCATTGCA CCCGGCGAGG 

28651 GCGGGCATAC ACAAGTGCGT CATGATGCCC GGGCGGGCAG GCCGGTCCGG 

28701 CTGGTGCCGG ATTCATTGCA TGTGGAGTAT GGCGAGGCGG GGGAGGCGTC 

28751 GCCCTGAGCG TGGCCCGGGC AGGGGGGCTA CGGCACGCTG TCGTAGCCTC 

28801 GTTCGCGCAG CGCTTCGCGC AAGGCACAGC GGGCGCGGGA AAGTCGGCTG 

28851 CGAATGGTTC CTACAGGAAC CGAAAGCAGT GCGGCAGCCT CTTCATAGGA 

28901 GAGTTCTTCC ACGCCGACCA TGAGGATCAC GTCGCGCATG CTTTCGGGGA 
Figure 5 (continued) 

28951 GCTGCTCCAG CGCTTCGCGT AGCAAGCGCA TGCGTTGACG CTGCTCGGTC 
29001 ACGGCTTCGG GGTTGGGCGC ACTGCATGGC ATGACACCCA GGCTGGCGTC 
29051 GGAGGTGAAT TCATAACGGC GCTCTGGCGC ACGCGACAAG TGATTGCGGA 
29101 CCAGATTGAG CGCGATGCCG TACAGCCAGG TGGAAAGCTG GGAGTCGCCA 
29151 CGGAACGATT GATACGCGCG CGCCGCCTCG GCGAAAGCCT GCTGCGCAAG 
29201 GTCCTCGACG TCGCTGCTGT GGCCGATGTG CTTGGCGATG AATCTCTGTA 
29251 ATCGGCCGGC ATGTTCGGCT ACCAGGCTGC TCAGCAACTG CTGGTCCGAT 
29301 CGCTCGGTGG CGATATCGGG ATATGCGTTG TCCGGTTTTT CGAGAGAAGC 
29351 GGGAAGACCG GATGACGGGG GAGAAACCAT GCAAAGCGAT ACCAAGTGAA 
29401 AGGGTGATAA TTCACGTCAC CAAGATACTG ACTGCCGGTT TTATCCGGCA 
29451 GTTGTTAACT TCCGAAACTA ATGTCGGATC GCGGTCGCTA CCGGAGCATT 
29501 CAGATACAAC GCGCTGAACG TCTTCCGTAA AACTTACGAC GCGACGTATG 
29551 GGGACTACAC CAGGTGGTGC GCAAGGACCT CGCGCAACCA TTCTTGCGCC 
29601 GTATGGGCGG ATACCGGCGC CTTACCTGAT TGTGCGGCGG ATGCGAAGCG 
29651 CTGCTGCATG CTTTCGTCGA TCGTGGCGCG CAACGTTCCG GATATCTGAT 
29701 CGGGAGAGAC ACCGGTGTCG TGACAGACCT GTCGGAAGCC GGCGGCGTCG 
29751 GCGCGGGCCG ACCAGGACAG GAAGGTAAGG AAATCCACGC CCTGCAGGGC 
29801 GTGGCGTGAG GTTTCCGCCA TGTCGACCGC TCGCGTGACG ACGCGCGCCG 
29851 ACAGCGATGT CGTCGGGACC GCGTGCAGAT CGAGCTCCTT GGCGACGGCG 
29901 CGGGCGATTC CCGTGCCGTA GGCCAGCGCC AGCGCGCTGG AAAACATGAC 
29951 AAGCGTGTCC TCGTCGGTGG CGACCCAGGA TACGTTGCGC CCTGACGGCG 
30001 TCGTGCCGGA TGCGATGACC TGGAACTGCT CGCCTGCTTT GGTGACATAT 
30051 AGGGTCTCTC CTTGGCTGGC GGCGCGCGCA AAGACATCAA GCTCCAAAGC 
30101 AGGTAAAGCG GGTGGGATCT GGAAGTTCAT GCGGTGGGCC GTTGTCTCGA 
30151 ATCCTTGGGT GTATCGCGTT CTACGGAGCG GGAACATGGA ATATGCACTG 
30201 TGTGACGCTT TCGGCTCTTC GGTTCCATGC CGGCATGACA AACCCGACGG 
30251 CATGCAGGCC TCGCTCCCGT CCGGCCACCG AGGCTCGCCT TGAACGCTGC 
30301 GCAAAGTGTC GGCCATACGC TGGGCCTCGA GCTGGAAATG GTGGTGGCCT 
30351 GTCGTGCGAC CGGGGCCAGC CATCCGGTGG CGCGCCATTT CGAAGCGCTG 
30401 CGCAACTTGC GCCGTCAACG CGGAGAGTCC GTGCAGGAGT ACCGTCTGGA 
30451 CGCGCGCCTG TGCGGGGGGT GGGCGGTCCC CATGGCCTGA GCGGGCTTGA 
Figure 5 (continued) 

30501 TAACGGCTAT AACCTGCTCG AAACCGCATT CGCGCCTGTG AACGGCGGCG 
30551 CGGGAGGGCT GCGCCGACTG GCCGAAGCGG TGCGCCGCGA GCTGCGTGAC 
30601 ACGCAGCTCG CCCTGGCCGC GGAGGGGGCG ATGCTGATCA ACGCGGCCGA 
30651 GCACCCCGCC GCCAGCCTGG ATGCCGACTG GTACCGGCGA GTGCGGGTGC 
30701 CGCGGCCCAT CTACGAGGAA CTCGTCGGCC AGCGAGGCTG GCTGCACCGG 
30751 ATCGGGATAG ACGCCAAGGC ACAGAACAGC CCCTGCACGT CGGTTCCCGT 
30801 GGCCATCGCC GCGCGCTGCC TGAACGTCGT GCTGGCGCTG GCTCCCGCGC 
30851 AGATCGCCAT GTTCGCCAAC AGCCCGCTGG AGGCAGGGCG GGTGACCGGT 
30901 CTCAAGGAAA ACCGCCTGAC CCTGTGGCCG CGCATGTTCC GAGGCGCGCG 
30951 CTACCTGGGC GACGACCTGC TGCATCGCCT GCCTGCAAGG CCGTTTCGCG 
31001 ATCTCGGCGA TTATTTCCGC TGGATGTTCG GCGGATTGAC CGCCAGCCGG 
31051 GCGCTACCGC CGGGCGACGC TTGCGACTAC AAGAACGCCG ATGTGGCCTG 
31101 CCTGGTGGGA GCCCCTTCGC TGGCAGAGTT CCTGTATGCG GGCGCGTGGT 
31151 CCGCGCGAAA CCTGAATGAT GGCGGTTCCG TGCGTCTGGC CGCGCGCAGC 
31201 GAACATTTCG TCTATTCGCA GTTCGCGCAG TTCCTGGACG CGCGTTGGCG 
31251 CTACAGGATG CCGATTGTCC CCGCCTTGCC GGGGCTGTTG CGAGCCTGGG 
31301 ACAGGCAGGG CGGCCTGGAA GCGCTGTTCG AGCAGGCCGG CGCGCAAGGC 
31351 TACATCGAGG GGCGCGCGCC GGGCGCGGTA TTTGCCGATG CCGACTTGCT 
31401 GAGCTCAGCC GGCGATGCAG TCGCGGCCAG TGCGCCGATG GCGGCGTCGG 
31451 CGCTGCAATT GGGGCTGTTG CGCAATCTGC ACGACGCCGA GGCCCTGGTG 
31501 AGGCGATGGG GCTGGCTGCG CTTGCGTGCG TTGCGCGATC GGGCCATCGC 
31551 TTTGGCGTTG GACGATGCGC AGGTGCGCTG CCTTTGCCAA CAGGTCGTGG 
31601 CGGTAGCCGA AGGCGGGCTG GCCGGCGACG AGCAGCAATG GCTCGATTAT 
31651 GTGCGTTACG TGGTGGAAAC CGGCGAGACC GCCGCGGACC GCATGCTGCG 
31701 CTTGTGGCGC CAGGCGCGCG GCACGCCTGA GATGCGCCGC GCACAGGCGT 
31751 GCCGGCAGCG CGCGGTGCTG TCCTAGGATC CGGCACATTC CTTGCCAGGT 
31801 CTCGTTGGCC AGGCCTGTCA GTGCTCCGCT TCGTACACCT CGTCGGCCGC 
31851 CAGCAGAATA TCCACCGGTG GGTCGAAGCC GATGCGCCGC GCCGTTTCCA 
31901 GATTGATGGC TATCTTGGCG GGGGCATTCC AGACCTGGCT GATGCTGCGC 
31951 GGCTTTTCCC CATTGAAAAT GCGGGCAATG GTCTGGGCGT GGAACATGCC 



WO 00/37493 



PCT/EP99/10297 



Figure 5 (continued) 



32001 TACGCTGGAG TAGTCCGCCT TGGCCAGGCT CATCAACAGG CCGGCCTTGA 
32051 CCTCGTCGGA GCCTTGCATC GAGAAACTCG GCACGCGGGC GGCGCGCAGC 
32101 AGCGCGGCGA GCTGCTTGAC GGACGTCGAG GTGATGCCCC GGTGCTCGGT 
32151 GACGTAAAAG GCGTCGACTT CGCTCGACAG CTTCTGGTAG CAAGCCAGAA 
32201 CGTTCTGGGT TGCCGTGGCG ATGGGGATGC CGGTCGCGCG TGCGTCGCAA 
32251 CGCTTGACGG AGAAATCCAA TGCCGGCATT AGTGCGGCGA CCTTATCGAT 
32301 GGCTGCGTAG GTGCGACCTG CTTCGGTGTC TTCGTAGACC AGTCCAAGCG 
32351 TCTTGAACGG CACGATGTCA TGGAGCAGCT GGATCTGCCG CTGGTAGTGG 
32401 TCGGGCTGTA CCCGGGCATG CAGGTTGTCC TGGCCGCTGT CGGCCGCACT 
32451 GGGTATGATC CGGGCGCTTA TCGGGTCGGT CGACGAGACG ACCACGGTGG 
32501 GTACCGGCGT GCCCAGTTCG ACCATGTCCT GTCCAGCCCA GGTACCCATG 
32551 GCGATGATCA GGTCGATGTC CTTGGCGCCA TGCAGGCGTG CCGCAACGGC 
32601 TTCGCGCACG GCAGGCCGCA AGGCGGTGTC GAAGTTGCCG GGCTGCCACC 
32651 ACGCATCGGG CACGAACTCG ATGTAGTTGC TGCGGGCATG CGTGGCCAGG 
32701 TAAAGCCAGG CCTTTCGCAT ATCGGTTATC TCGGGCATGT CGTCGATACG 
32751 CAGCCATCCG AGTTGTTGCA ATGCGCGCGC GATCGCGTAG AGCGTGCGCG 
32801 GATACTCCTC GTACTCGCCG CTACCCACAT AACCGATGCG CCATTTCCGG 
32851 CCGGATGTAT GGGAGGGAGG CGGAAGGCGG GGCGAGGTAA GGGCGACGCC 
32901 GGAGCTCAGG GCCGCGACAG GAGGAGGGCT GGATGCCGCC GCGATGGGCC 
32951 ACGCGAGGCC AAGAAGCAGG GCGAGCGGGG CGAGTATTCC GGGGCGAAGG 
33001 GTCATGGGCG ATGAATGGCG ATGATGGTGA GATCGTCGGA TTGTTCGGAA 
33051 TCGGCGGCGA ATTCGCGTAG GTCGTGCAGA ATGTGCTCGA TGAGTTCGGC 
33101 CGCTGCGTGC GGCGCGCCCT GCATCAGGGC GACCAGCCGC GGCAGACCAT 
33151 ACTGGGCGCA GCCGCCGTGG ATGGCTTCGG TGACGCCGTC GGTAAACGCG 
33201 ACCAGCGAGG TGCCGTTCGG CAAGGTGGTG CTCAGGGTGG AATACGCCTC 
33251 GTTGTCCAGC ACGCCGCAGG CCGCGCCCCT GCTTCCTTGA AGCAGGCGGA 
33301 CCTCGCCACG TTCGTCGATG AGCAGCGGCG GCGGGTGGCC GGCGTTGACC 
33351 CAGGCCAGGG CGCCTGTTTC CGGGGTGAAG ACGCCTATCA GCAAGGTGAC 
33401 AAACATCAGC TTGGGGTTGT TCTCGGCCAG ACGGTGGTTC ACCTTGGTGG 
33451 CGATGGCGCC CGGGTCGTGC TCTTCTTCCG CCACGCTGCG TATCAAGGTC 
33501 CTGACGATGG CCATGAACAG GGCCGCGGGC ACGCCTTTTC CGGATACGTC 
Figure 5 (continued) 



33551 GCCGATGGCA AAGCACAGAC GCCCGTCTGC CAGCACGAAG TAGTCGTAGA 
33601 AATCCCCACC GACCTCCCGG GCCGGGTACA TGACGGCACG CAACTGGCTG 
33651 CCGCGCGTGG CCGCATCGGG CAACGGCTGG GGAAGCAGGC CAAGTTGGAT 
33701 GGAGCGGGCG ATGCTCAATT CGCTTTCGAG GCGTTCGCGG TTCGATATCT 
33751 GCGCCATCAG GGCCCGCACA TTGTGGTGCA GCTGTTCGTT CATGAACAGG 
33801 AACGATTCGG CGAGCTGTCC GACTTCGTCG CGCCGCCGGC GCGGCAGGCA 
33851 TGCCACCGAC GGCGGAACCC GGATCGGCTC GGTGAGGTCC TGGGTGGGAA 
33901 GCTGGCGAGC GTAGTTGCTC AGTTGCGCCA ACGGCCGGGC GATGCGCACC 
33951 GCCACCACCC ATGCCAGCAT CAGCCCGGCC AGCAAGGTGG CGGCGAAGAT 
34001 CAGTGCCTGC CGGCGCACCA GATTCTGTGC CGGGTCGGTC AGGTCCGGCT 
34051 CGGGAACGAC ACCGATGATG GTCCAATGCA GCGGCTTGTA TCGCAGGGCG 
34101 TCGATCTGCC AGGCGCTTTC GCCGTTGGTA AAGCGCAACG TCAGGCCGCG 
34151 GGTAGACGAG ATTTCGGCAA GCATCGAATG CAATACCCGT CCCGATTCGA 
34201 CGTCTGTCGA GTCCAGCAGC CGGGCGGCCG ATGGGGGTGG CGGCACGATC 
34251 ACCGTGCCAT CGTCCGCAAC CACGAACACG AAACCATGGC GGCTGAGCCG 
34301 CAGCTCCGAC AGGTTCCGGT CTATCGCGGC AATCATGTTG GCTTTCTGGG 
34351 CGGCAACCTT GTCGATGATG GCTTGTGAGC TATCGGAGAT GGCGAGAACC 
34401 CACTTCCACG CGGGGAAGTA CACGAAATAG GCGTGTCGCA TCTGGGCGGA 
34451 CTCGTCCAGG GGAGACGGGT AGATGGCGAA GCCGCGACCG TCGTTGCGGC 
34501 TTTCCTCGTA CATGGCGGCG GCGAGCGGCC GGCCCTTGAA GTCGCGGATC 
34551 CCGGAGAGGT CCCGGTCGAT CATCCGGGGG TTGGTGCTGG CCAGCACGGT 
34601 GCCTTCCGCG TCATAGGCGA AGGCGACGCG GCGCGGTCCC AGGTCGAGAT 
34651 GGTTCAGCCA GACACGCGCC ATGCCCTTGG CGGCGCCGGT AGTGACGTGT 
34701 CCGCGCTCGG CCTGCGCGGC ATAGGCGTTC AGCACCGATG TCACGACCGC 
34751 GCTGAGTTGG ATCAGTTGCC TGCGGCTTTC CCGGATGGTG CGGATCTTGT 
34801 CGTCGAGCAG CGTGGACCAG CGCGTGTCGG TGTCACGAAC TACCAAGTCC 
34851 ATGATGTTAC TGACGGCATG CAGTTCGTTC TTGATGATGT TGTTCGTGAC 
34901 ATCGCGCTGG GTCACGAGCA TCACGACGAT GCCTACGAGT AGCAGCGTGG 
34951 ATGCAATGAG CAGGAGGAAC TTCCCACGCA ATGAAAGCGG CAACTCTAGC 
35001 CGGGGAGTAC GGCGCATGAA CATGAA 










kDa Orf2 Orf4 Orf6 Orf10 




FIGURE 6 






Docket No. B45168 



DECLARATION AND POWER OF ATTORNEY 



PCT/EP99/10297 



As a below named inventor, I hereby declare that: 

My residence, post office address and citizenship are as stated below next to my name. 
International application in the manner provided by the first paragraph of Title 35, United States Code, Section 
112, I acknowledge the duty to disclose information which is material to patentability as defined in Title 37, Code 
of Federal Regulations, Section 1.56 which became available between the filing date of the prior application and 

Intellectual Property-U.S., UW2220, P.O. Box 1539, King of Prussia, Pennsylvania 19406-0939, whose telephone 
number is 610-270-5024. 

I hereby declare that all statements made herein of my own knowledge are true and that all statements made on 
information and belief are believed to be true; and further that these statements were made with the knowledge 
that willful false statements and the like so made are punishable by fine or imprisonment, or both, under Section 
1001 of Title 18 of the United States Code and that such willful false statements may jeopardize the validity of the 
application or any patent issued thereon. 